Selfish  Routing  and  the  Price  of  Anarchy 


Tim  Roughgarden* 
January  7,  2006 


Abstract 

Selfish  routing  is  a  classical  mathematical  model  of  how  self-interested  users  might 
route  traffic  through  a  congested  network.  The  outcome  of  selfish  routing  is  generally 
inefficient,  in  that  it  fails  to  optimize  natural  objective  functions.  The  price  of  anarchy 
is  a  quantitative  measure  of  this  inefficiency. 

We  survey  recent  work  that  analyzes  the  price  of  anarchy  of  selfish  routing.  We 
also  describe  related  results  on  bounding  the  worst-possible  severity  of  a  phenomenon 
called  Braess’s  Paradox,  and  on  three  techniques  for  reducing  the  price  of  anarchy  of 
selfish  routing.  This  survey  concentrates  on  the  contributions  of  the  author’s  PhD 
thesis,  but  also  discusses  several  more  recent  results  in  the  area. 


1  Introduction 

Over  the  past  several  years,  there  has  been  a  tremendous  surge  of  activity  at  the  interface 
of  computer  science  and  economics.  This  survey  is  a  brief  introduction  to  two  intertwined 
facets of this emerging research area: the price of anarchy and selfish routing. The price of
anarchy,  first  defined  by  Koutsoupias  and  Papadimitriou  [59,  69],  measures  the  extent  to 
which  competition  approximates  cooperation.  It  is  motivated  by  the  well-known  fact  that 
noncooperative  equilibria  can  be  inefficient,  in  that  they  need  not  optimize  natural  objective 
functions  [33,  75].  Selfish  routing  refers  to  a  mathematical  model  of  traffic  in  a  congested 
network.  This  model  has  a  long  history  in  the  transportation  science  literature  [8,  13,  72,  100] 
and  has  also  been  widely  studied  by  the  computer  networking  community  (see  e.g.  [11,  16, 
42,  43,  67,  73]).  The  price  of  anarchy  has  recently  been  extensively  studied  in  this  model. 

This  survey  concentrates  on  the  contributions  of  the  author’s  PhD  thesis  [82],  but  also 
discusses  several  more  recent  results  on  the  price  of  anarchy  of  selfish  routing.  In  most  cases, 
we  provide  self-contained  proofs.  Many  more  details,  results,  and  references  can  be  found  in 
the recent book [86], which is an expanded and revised version of [82].

(a) Pigou’s example (b) A nonlinear variant

Figure 1: Pigou’s example and a nonlinear variant. The cost function c(x) describes the cost
incurred  by  users  of  an  edge,  as  a  function  of  the  amount  of  traffic  routed  on  the  edge. 


1.1  Two  Motivating  Examples 

We now introduce selfish routing and motivate the results described in this survey by
informally exploring two important examples. Pigou discovered the first example in 1920 [72];
Braess found the second in 1968 [14].

Example  1.1  (Pigou’s  example  [72])  Consider  the  simple  network  shown  in  Figure  1(a). 
Two  disjoint  edges  connect  a  source  vertex  s  to  a  sink  vertex  t.  Each  edge  is  labeled  with  a 
cost function $c(\cdot)$, which describes the cost (e.g., travel time) incurred by users of the edge,
as a function of the amount of traffic routed on the edge. The upper edge has the constant
cost function c(x) = 1, and thus represents a route that is relatively long but immune to
congestion.  The  cost  of  the  lower  edge,  by  contrast,  is  governed  by  the  function  c(x)  =  x 
and  thus  increases  as  the  edge  gets  more  congested.  In  particular,  the  lower  edge  is  cheaper 
than  the  upper  edge  if  and  only  if  less  than  one  unit  of  traffic  uses  it. 

Suppose  there  is  one  unit  of  traffic,  representing  a  very  large  population  of  network  users, 
and  that  each  user  chooses  independently  between  the  two  routes  from  s  to  t.  Assuming  that 
each  network  user  aims  to  minimize  its  cost,  we  should  expect  all  traffic  to  follow  the  lower 
edge.  Indeed,  each  network  user  should  reason  as  follows:  the  lower  route  is  never  worse 
than  the  upper  one,  even  when  it  is  fully  congested,  and  it  is  superior  whenever  some  of  the 
other users are foolish enough to take the upper route. In the “selfish routing outcome”, we
therefore expect all network users to incur one unit of cost.

Now  suppose  that,  by  whatever  means,  we  can  choose  how  the  traffic  is  routed.  Can 
we  leverage  this  power  to  improve  over  the  selfish  routing  outcome?  To  see  that  we  can, 
consider  assigning  half  of  the  traffic  to  each  of  the  two  routes.  The  network  users  forced  onto 
the  upper  edge  experience  one  unit  of  cost,  and  are  thus  no  worse  off  than  in  the  previous 
outcome.  On  the  other  hand,  users  permitted  on  the  lower  edge  now  enjoy  lighter  traffic 
conditions,  and  incur  a  mere  1/2  unit  of  cost.  We  have  therefore  lowered  the  cost  of  half  of 
the  users  while  making  no  one  worse  off.  Moreover,  the  average  cost  incurred  by  traffic  has 
decreased from 1 to 3/4.

Pigou’s example demonstrates that selfish routing need not produce an optimal outcome.
This phenomenon can be amplified with a seemingly minor modification to Example 1.1.
Suppose we replace the previously linear cost function c(x) = x with the highly nonlinear
one $c(x) = x^p$ for p large (Figure 1(b)). As in Example 1.1, selfish users will all travel on the
lower route, incurring a cost of 1. On the other hand, if we could force a small $\epsilon$ fraction of
the traffic to travel along the upper route, then the average cost would drop to $\epsilon + (1 - \epsilon)^{p+1}$,
which approaches 0 as $\epsilon$ tends to 0 and p tends to infinity.
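As an illustrative aside, the effect of forcing an $\epsilon$ fraction of the traffic onto the upper route can be checked numerically. The sketch below is not part of the original example; the function name `average_cost` and the sample values of $\epsilon$ and p are assumptions chosen for illustration.

```python
def average_cost(eps: float, p: int) -> float:
    """Average cost in Pigou's network with one unit of traffic, when a
    fraction eps uses the upper edge (cost 1) and the remaining 1 - eps
    uses the lower edge (cost x**p)."""
    upper = eps * 1.0                        # eps users each pay 1
    lower = (1.0 - eps) * (1.0 - eps) ** p   # 1 - eps users each pay (1 - eps)**p
    return upper + lower


if __name__ == "__main__":
    # Selfish outcome: eps = 0, so every user pays 1 regardless of p.
    print(average_cost(0.0, 1))
    # Linear case with an even split, as in Example 1.1.
    print(average_cost(0.5, 1))
    # Large p with a small forced fraction: the average cost approaches eps.
    for p in (10, 100, 1000):
        print(p, average_cost(0.05, p))
```

Shrinking the forced fraction while letting p grow drives the printed average toward 0, in line with the limit stated above.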

In  Section  2,  we  will  define  the  price  of  anarchy  of  selfish  routing  as  the  average  cost 
of  traffic  in  a  selfish  outcome  divided  by  the  minimum-possible  average  cost.  If  the  price 
of  anarchy  of  a  network  is  close  to  1,  then  we  conclude  that  the  negative  impact  of  selfish 
routing  is  relatively  small.  The  price  of  anarchy  in  Example  1.1  is  at  least  4/3,  and  it  tends 
to  infinity  with  p  in  the  nonlinear  variant  of  Pigou’s  example. 

The  price  of  anarchy  of  selfish  routing  can  therefore  be  large  if  the  network  cost  functions 
are  “sufficiently  nonlinear” .  Pigou’s  example  and  its  nonlinear  variant  motivate  the  following 
questions,  which  are  central  to  Section  2  of  this  survey.  Can  the  price  of  anarchy  be  large 
even  when  cost  functions  are  “not  too  nonlinear”?  Is  the  price  of  anarchy  larger  in  bigger, 
more  complicated  networks?  Is  it  larger  in  multicommodity  networks,  where  traffic  emanates 
from  and  terminates  at  multiple  locations?  In  Section  2  we  will  prove  that  the  answer  to  all 
of these questions is “no”: in fact, Pigou’s example and simple variants are in some sense
universal  bad  examples  for  the  price  of  anarchy  of  selfish  routing. 

While  the  price  of  anarchy  in  our  next  example  is  no  larger  than  in  Pigou’s  example,  it 
is  arguably  a  more  startling  and  unintuitive  display  of  the  suboptimality  of  selfish  routing. 

Example 1.2 (Braess’s Paradox [14]) Consider the four-node network shown in Figure 2(a).
There are two disjoint routes from s to t, each with combined cost 1 + x, where x is
the  amount  of  traffic  that  uses  the  route.  The  routes  are  therefore  identical,  and  selfish  traffic 
should  split  evenly  between  them.  Assuming  that  there  is  one  unit  of  traffic,  all  network 
users  experience  3/2  units  of  cost  in  the  selfish  routing  outcome. 

Now  suppose  that,  in  an  effort  to  decrease  the  cost  encountered  by  the  traffic,  we  build 
a  short,  high-capacity  edge  connecting  the  midpoints  of  the  two  existing  routes.  The  new 
network is shown in Figure 2(b), with the new edge (v, w) possessing the constant cost
function  c(x)  =  0.  How  will  selfish  traffic  react? 

We cannot expect the previous traffic pattern to persist in the new network. As in Pigou’s
example, the cost of the new route s → v → w → t is never worse than that along the two
original paths, and it is strictly less whenever some traffic fails to use it. We therefore expect
all network users to deviate to the new route. Because of the ensuing heavy congestion
on the edges (s, v) and (w, t), all of the traffic now experiences two units of cost. Braess’s
Paradox thus shows that the intuitively helpful action of adding a new zero-cost edge can
increase the cost experienced by all of the traffic!
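The reasoning in Example 1.2 can be replayed in a few lines of code. The following is a minimal sketch, assuming the usual labeling of Braess's network: cost x on the edges (s, v) and (w, t), cost 1 on (v, t) and (s, w), and cost 0 on the added edge (v, w). These edge labels, the path names, and the helper functions are assumptions made for illustration, since the figure itself is not reproduced in the text.

```python
def path_costs(flow):
    """Map each path to its cost, given the amount of traffic on each path.

    Assumed edge costs: c(x) = x on (s,v) and (w,t), c(x) = 1 on (v,t) and
    (s,w), and c(x) = 0 on the added edge (v,w).
    """
    f_sv = flow["s-v-t"] + flow["s-v-w-t"]   # traffic on edge (s,v)
    f_wt = flow["s-w-t"] + flow["s-v-w-t"]   # traffic on edge (w,t)
    return {
        "s-v-t": f_sv + 1.0,
        "s-w-t": 1.0 + f_wt,
        "s-v-w-t": f_sv + 0.0 + f_wt,
    }


def is_equilibrium(flow, available, tol=1e-9):
    """True if every available path that carries traffic is a cheapest one."""
    costs = path_costs(flow)
    cheapest = min(costs[p] for p in available)
    return all(costs[p] <= cheapest + tol for p in available if flow[p] > 0)


ORIGINAL = ["s-v-t", "s-w-t"]
AUGMENTED = ["s-v-t", "s-w-t", "s-v-w-t"]

even_split = {"s-v-t": 0.5, "s-w-t": 0.5, "s-v-w-t": 0.0}
all_on_new = {"s-v-t": 0.0, "s-w-t": 0.0, "s-v-w-t": 1.0}

if __name__ == "__main__":
    # Original network: the even split is stable and costs 3/2 per user.
    print(is_equilibrium(even_split, ORIGINAL), path_costs(even_split)["s-v-t"])
    # Augmented network: the even split is no longer stable.
    print(is_equilibrium(even_split, AUGMENTED))
    # Augmented network: all traffic on the new route is stable and costs 2.
    print(is_equilibrium(all_on_new, AUGMENTED), path_costs(all_on_new)["s-v-w-t"])
```

Under these assumed costs, the even split stops being stable once the new route is available, because that route then costs 1 rather than 3/2.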

Example 1.2 shows that adding a new edge to a network can increase the cost incurred by
selfish traffic. Equivalently, removing one edge from a network with linear cost functions can
decrease this cost by a factor of at least 4/3. Can removing edges from a network decrease
the cost incurred by selfish traffic by a larger factor in larger networks, or with nonlinear
cost functions, or with multiple commodities, or with multiple edge removals allowed? If so,
by how much? In Section 3 we give precise answers to all of these questions.

(b) Augmented network

Figure 2: Braess’s Paradox. The addition of an intuitively helpful edge can adversely affect
all of the traffic.

1.2  Overview 

We  begin  in  Section  2  by  proving  matching  upper  and  lower  bounds  on  the  price  of  anarchy  of 
selfish  routing.  After  defining  the  classical  model  of  selfish  routing  that  we  study,  we  formalize 
the  lower  bound  on  the  price  of  anarchy  provided  by  simple  variants  of  Pigou’s  example.  As 
suggested  by  Example  1.1  and  the  subsequent  nonlinear  variants,  this  lower  bound  will 
depend  on  the  set  of  allowable  edge  cost  functions.  We  then  show  a  matching  upper  bound 
for  essentially  every  set  of  allowable  cost  functions.  For  example,  the  price  of  anarchy  in 
every multicommodity network with linear cost functions (functions of the form ax + b with
$a, b \ge 0$) is at most 4/3. Thus the price of anarchy in such networks is maximized by
Pigou’s  example  (Example  1.1).  Similarly,  the  price  of  anarchy  of  multicommodity  networks 
with  cost  functions  that  are  polynomials  with  nonnegative  coefficients  and  degree  at  most 
p  is  maximized  by  the  nonlinear  variant  of  Pigou’s  example  shown  in  Figure  1(b).  We  also 
explicitly  compute  the  largest-possible  price  of  anarchy  with  respect  to  several  different  types 
of  cost  functions. 

Section  3  studies  the  worst-possible  severity  of  Braess’s  Paradox.  We  show  that  Braess’s 
Paradox  can  be  arbitrarily  severe,  even  in  single-commodity  networks,  provided  nonlinear 
cost  functions,  large  networks,  and  multiple  edge  removals  are  permitted.  Precisely,  for  every 
$n \ge 2$, there is a single-commodity, n-vertex network such that removing $\lfloor n/2 \rfloor - 1$ edges
decreases the cost incurred by selfish traffic by an $\lfloor n/2 \rfloor$ factor. We also show that this
construction is optimal in several senses, discuss extensions to multicommodity networks,
and show that Braess’s Paradox is impossible to detect efficiently (assuming $P \neq NP$).

Section  4  tackles  the  problem  of  reducing  the  price  of  anarchy  in  networks  where  it  is 
unacceptably  high.  It  surveys  positive  results  for  three  distinct  approaches:  increasing  the 
network capacity, routing a small portion of the traffic centrally, and influencing network
users  by  taxing  network  edges. 

Finally,  Section  5  describes  the  broader  research  context  for  the  results  of  this  survey, 
and  discusses  other  recent  work  that  quantifies  the  inefficiency  of  noncooperative  equilibria. 

2  Bounding  the  Price  of  Anarchy 

This  section  formally  defines  selfish  routing  networks,  equilibria,  and  the  price  of  anarchy 
(Subsection  2.1);  introduces  a  simple  lower  bound  on  the  price  of  anarchy  that  is  based 
on  Pigou’s  example  (Subsection  2.2);  and  proves  a  matching  upper  bound  on  the  price  of 
anarchy  (Subsection  2.3). 

2.1  Preliminaries 

Selfish  Routing  Networks 

We  begin  by  reviewing  the  terminology  of  classical  multicommodity  flow  networks.  See  [2], 
for  example,  for  more  details  and  for  historical  notes  on  network  flows.  A  multicommodity 
flow  network  is  described  by  a  directed  graph  G  =  (V,E),  with  vertex  set  V  and  edge  set 
E, and a set $(s_1, t_1), \ldots, (s_k, t_k)$ of source-sink vertex pairs, also called commodities. Parallel
edges  are  allowed,  and  a  vertex  can  participate  in  multiple  source-sink  pairs. 

For a multicommodity network G, let $\mathcal{P}_i$ denote the set of simple $s_i$-$t_i$ paths and $\mathcal{P}$ the
union $\cup_{i=1}^{k} \mathcal{P}_i$. We always assume that $\mathcal{P}_i \neq \emptyset$ for every i. A flow in G is a nonnegative
vector, indexed by $\mathcal{P}$. For a flow f and a path $P \in \mathcal{P}_i$, we interpret $f_P$ as the amount of
traffic of commodity i that chooses the path P to navigate from $s_i$ to $t_i$. A flow f induces
a flow on edges $\{f_e\}_{e \in E}$, where $f_e = \sum_{P \in \mathcal{P} : e \in P} f_P$ denotes the total amount of flow using
the edge e. Finally, we use r to denote a nonnegative vector of traffic rates, indexed by the
commodities of G. A flow f in G is feasible for r if it routes all of the prescribed traffic: for
each $i \in \{1, 2, \ldots, k\}$, $\sum_{P \in \mathcal{P}_i} f_P = r_i$.

To  model  the  negative  consequences  of  increasing  congestion,  we  give  each  edge  e  of  a 
network G a nonnegative, continuous, and nondecreasing cost function $c_e$. A cost function
$c_e(\cdot)$ denotes the cost (e.g. travel time) incurred by traffic that traverses edge e, as a function
of the edge congestion $f_e$. A selfish routing network is then given by a triple of the form
(G,r,  c),  where  G  is  a  multicommodity  flow  network,  r  is  a  vector  of  traffic  rates,  and  c  is  a 
vector  of  cost  functions,  indexed  by  the  edges  of  G.  We  often  call  such  a  triple  an  instance. 

Equilibria 

We next discuss equilibria in selfish routing networks. Let f be a flow feasible for the instance
(G, r, c). The overall cost $c_P(f)$ incurred by traffic on the path P in the flow f is defined as
the sum of the costs of the constituent edges: $c_P(f) = \sum_{e \in P} c_e(f_e)$. Naturally, we expect
selfish  traffic  to  attempt  to  minimize  its  cost.  This  leads  to  the  following  definition,  which 
was first formulated by Wardrop [100].

Definition 2.1 ([100]) Let f be a feasible flow for the instance (G, r, c). The flow f is a
Wardrop equilibrium if, for every commodity $i \in \{1, 2, \ldots, k\}$ and every pair $P, \tilde{P} \in \mathcal{P}_i$ of
$s_i$-$t_i$ paths with $f_P > 0$,

$$c_P(f) \le c_{\tilde{P}}(f).$$

In other words, all paths in use by a Wardrop equilibrium f have minimum-possible cost
(given their source, sink, and the congestion caused by f). In particular, all paths of a given
commodity  used  by  a  Wardrop  equilibrium  have  equal  cost.  In  the  theoretical  computer 
science  literature,  Wardrop  equilibria  are  also  called  Nash  flows.  Haurie  and  Marcotte  [47] 
formalized  the  precise  correspondence  between  Wardrop  equilibria  and  Nash  equilibria  of 
finite  normal-form  games  [66]. 

Remark  2.2  In  Definition  2.1,  we  are  implicitly  assuming  that  every  network  user  controls 
a  negligible  portion  of  the  overall  traffic,  so  that  the  actions  of  an  individual  user  have 
essentially  no  effect  on  the  network  congestion.  In  the  game  theory  literature,  games  with 
this  property  are  called  nonatomic  [91].  Several  recent  papers  have  analyzed  the  price  of 
anarchy  in  atomic  variants  of  the  selfish  routing  model  studied  in  this  survey;  see  Section  5 
for  references. 

Beckmann,  McGuire,  and  Winsten  [8]  resolved  the  important  issues  of  existence  and 
uniqueness  of  Wardrop  equilibria. 

Proposition 2.3 ([8]) Let (G, r, c) be an instance.

(a) The instance (G, r, c) admits at least one Wardrop equilibrium.

(b) If f and $\tilde{f}$ are Wardrop equilibria for (G, r, c), then $c_e(f_e) = c_e(\tilde{f}_e)$ for every edge e.

The first part of Proposition 2.3 guarantees that a Wardrop equilibrium exists in every
instance. The second part states that every two Wardrop equilibria induce identical edge costs.
While  two  Wardrop  equilibria  need  not  induce  identical  flows  on  edges,  Proposition  2.3(b) 
is  strong  enough  for  our  purposes. 

The  proof  of  Proposition  2.3  in  [8]  is  remarkable.  Beckmann,  McGuire,  and  Winsten  [8] 
showed, by invoking the Karush-Kuhn-Tucker conditions (see e.g. [71]), that the Wardrop
equilibria of an instance (G, r, c) are precisely the flows that minimize the potential function

$$\Phi(f) = \sum_{e \in E} \int_0^{f_e} c_e(x)\,dx \qquad (1)$$

over all feasible flows for (G, r, c). Since cost functions are continuous and the space of
all feasible flows is compact, Weierstrass’s Theorem then implies Proposition 2.3(a). Since cost
functions are nondecreasing, the function $\Phi$ is convex, and Proposition 2.3(b) then follows
without  much  difficulty.  This  use  of  a  potential  function  has  been  influential  in  both  game 
theory  and  theoretical  computer  science.  Led  by  the  work  of  Rosenthal  [77]  and  Monderer 
and  Shapley  [65],  potential  functions  have  become  a  standard  tool  in  noncooperative  game 
theory  for  proving  the  existence  of  pure-strategy  Nash  equilibria  in  certain  classes  of  games. 
In  theoretical  computer  science,  potential  functions  have  been  used  to  bound  the  price  of 
anarchy in several applications [3, 52, 88, 89]. Intuitively, if equilibria optimize a potential
function that is “close to” the objective function, then equilibria cannot be too inefficient.
Indeed, the proximity between the potential function $\Phi$ in (1) and our objective function (3)
below  implies  near-optimal  upper  bounds  on  the  price  of  anarchy  of  selfish  routing  [89].  In 
this  survey  we  focus  only  on  optimal  bounds,  however,  which  follow  from  a  different  proof 
approach. 
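To make the contrast between the potential function and the total cost concrete, the sketch below evaluates both on Pigou's example (Example 1.1) and locates their minimizers by a simple grid search. It is an illustration only; the parameterization by the amount x of traffic on the lower edge, the grid resolution, and the function names are assumptions.

```python
def potential(x: float) -> float:
    """Potential function in Pigou's example, with x units on the lower edge
    (cost t at load t) and 1 - x units on the upper edge (constant cost 1)."""
    # Integral of 1 over [0, 1 - x] plus integral of t over [0, x].
    return (1.0 - x) + 0.5 * x * x


def total_cost(x: float) -> float:
    """Total cost of the same flow: each edge's load times its per-user cost."""
    return (1.0 - x) * 1.0 + x * x


# Candidate flows: x = 0, 0.001, ..., 1 on the lower edge.
GRID = [i / 1000 for i in range(1001)]
x_equilibrium = min(GRID, key=potential)   # minimizer of the potential
x_optimal = min(GRID, key=total_cost)      # minimizer of the total cost

if __name__ == "__main__":
    print(x_equilibrium, total_cost(x_equilibrium))
    print(x_optimal, total_cost(x_optimal))
    print(total_cost(x_equilibrium) / total_cost(x_optimal))
```

The two functions differ only in the factor 1/2 on the quadratic term, which is what moves the minimizer from the even split to the all-lower-edge flow.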

Speaking of which, the following variational inequality characterization of Wardrop equilibria,
due to Smith [93], will play a crucial role in our upper bound on the price of anarchy.

Proposition 2.4 ([93]) A flow f feasible for (G, r, c) is a Wardrop equilibrium if and only if

$$\sum_{e \in E} c_e(f_e) f_e \le \sum_{e \in E} c_e(f_e) f^*_e$$

for every flow $f^*$ feasible for (G, r, c).

Proposition 2.4 can easily be derived as an optimality condition for minimizers of the
potential function (1). For simplicity, we instead give a short direct proof.

Proof: Definition 2.1 easily implies that a flow f is a Wardrop equilibrium if and only if

$$\sum_{P \in \mathcal{P}} c_P(f) f_P \le \sum_{P \in \mathcal{P}} c_P(f) f^*_P \qquad (2)$$

for every flow $f^*$ feasible for (G, r, c). Writing $c_P(f) = \sum_{e \in P} c_e(f_e)$ and reversing the order
of summation on both sides of (2) then proves the proposition. ■

The  Price  of  Anarchy 

We  conclude  the  preliminaries  by  defining  the  price  of  anarchy.  Since  this  definition  aims  to 
quantify  the  inefficiency  of  an  equilibrium,  it  requires  an  objective  function.  We  adopt  the 
usual objective function from min-cost network flow, and define the cost C(f) of a flow f in
(G, r, c) as

$$C(f) = \sum_{P \in \mathcal{P}} c_P(f) f_P = \sum_{e \in E} c_e(f_e) f_e. \qquad (3)$$

The  first  equality  in  (3)  is  a  definition;  the  second  follows  from  the  same  reversal  of  sums 
as  in  the  proof  of  Proposition  2.4.  A  flow  feasible  for  an  instance  (G,  r,  c)  is  optimal  if  it 
minimizes  the  cost  over  all  feasible  flows.  Because  cost  functions  are  continuous  and  the 
space  of  flows  is  compact,  every  instance  admits  an  optimal  flow. 
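The reversal of sums that equates the path-based and edge-based expressions for the cost of a flow can be checked on a small instance. The sketch below is illustrative: the three-path network, its edge names, cost functions, and flow values are all hypothetical, and the helper names are assumptions.

```python
def edge_flows(paths, path_flow):
    """Edge flow f_e: the sum of f_P over all paths P that contain edge e."""
    flows = {}
    for name, edges in paths.items():
        for e in edges:
            flows[e] = flows.get(e, 0.0) + path_flow[name]
    return flows


def cost_by_paths(paths, path_flow, cost_fns):
    """Sum over paths of (path cost) * (path flow)."""
    f_e = edge_flows(paths, path_flow)
    total = 0.0
    for name, edges in paths.items():
        c_path = sum(cost_fns[e](f_e[e]) for e in edges)
        total += c_path * path_flow[name]
    return total


def cost_by_edges(paths, path_flow, cost_fns):
    """Sum over edges of (edge cost) * (edge flow)."""
    f_e = edge_flows(paths, path_flow)
    return sum(cost_fns[e](x) * x for e, x in f_e.items())


# Hypothetical single-commodity instance: paths P1 and P2 share edge "a".
PATHS = {"P1": ("a", "b"), "P2": ("a", "c"), "P3": ("d",)}
COSTS = {
    "a": lambda x: x,
    "b": lambda x: 1.0,
    "c": lambda x: 2.0 * x,
    "d": lambda x: x * x + 1.0,
}
FLOW = {"P1": 0.25, "P2": 0.25, "P3": 0.5}

if __name__ == "__main__":
    print(cost_by_paths(PATHS, FLOW, COSTS), cost_by_edges(PATHS, FLOW, COSTS))
```

Both functions return the same value for any path flow, since each term $c_e(f_e)$ is counted once for every unit of flow that crosses e.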

We  now  define  the  price  of  anarchy  as  the  ratio  between  the  cost  of  a  Wardrop  equilibrium 
and  of  an  optimal  flow. 

Definition 2.5 ([59, 69]) The price of anarchy $\rho(G, r, c)$ of an instance (G, r, c) is

$$\rho(G, r, c) = \frac{C(f)}{C(f^*)},$$

where f is a Wardrop equilibrium and $f^*$ is an optimal flow for (G, r, c). The price of anarchy
$\rho(\mathcal{I})$ of a non-empty set $\mathcal{I}$ of instances is $\sup_{(G, r, c) \in \mathcal{I}} \rho(G, r, c)$.

Definition 2.1 and Proposition 2.3(b) easily imply that all Wardrop equilibria have equal
cost,  and  thus  the  price  of  anarchy  of  an  instance  is  well  defined  unless  there  is  a  flow  with 
zero  cost.  In  this  case,  all  Wardrop  equilibria  also  have  zero  cost,  and  we  define  the  price  of 
anarchy  of  the  instance  to  be  1. 


2.2  The  Pigou  Bound 

Definition  of  the  Pigou  Bound 

Pigou’s  example  and  its  nonlinear  variant  (Subsection  1.1)  show  that  the  price  of  anarchy 
of  selfish  routing  depends,  at  the  very  least,  on  the  type  of  cost  functions  allowed.  We 
will  therefore  aim  for  a  bound  on  the  price  of  anarchy  that  is  parameterized  by  the  set  of 
allowable  cost  functions,  and  that  is  optimal  for  each  such  set.  Common  examples  of  sets  of 
cost  functions  include  linear  functions,  polynomials,  and  queueing  delay  functions. 

For  every  set  C  of  allowable  cost  functions,  Pigou-like  examples  provide  a  natural  lower 
bound  on  the  price  of  anarchy  of  instances  with  cost  functions  in  C.  Specifically,  suppose  C 
contains all of the constant cost functions, and choose a cost function $c_2 \in C$ and a traffic
rate $r > 0$. Let $c_1 \in C$ denote the cost function everywhere equal to $c_2(r)$. Consider the
usual two-node, two-link network of Pigou’s example (Figure 1), give the upper and lower
edges the cost functions $c_1$ and $c_2$, respectively, and set the traffic rate to be r. Routing all
traffic on the lower edge yields a Wardrop equilibrium with cost $c_2(r) r$. The price of anarchy
in this instance is thus

$$\max_{0 \le x \le r} \frac{r \cdot c_2(r)}{x \cdot c_2(x) + (r - x) c_2(r)}.$$

Definition 2.6 below uses this expression but does not constrain x from above by r; since $c_2$
is nondecreasing, this modification does not affect the value of the maximum.

We  can  now  obtain  a  lower  bound  on  the  price  of  anarchy  by  choosing  the  cost  function 
$c_2$ and the traffic rate r in the most pernicious way possible.

Definition 2.6 ([27, 83]) Let C be a nonempty set of cost functions. The Pigou bound
$\alpha(C)$ for C is

$$\alpha(C) = \sup_{c \in C} \sup_{x, r \ge 0} \frac{r \cdot c(r)}{x \cdot c(x) + (r - x) c(r)}, \qquad (4)$$

with the understanding that 0/0 = 1.
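For a single cost function, the inner supremum in the Pigou bound can be approximated by brute force. The sketch below is a rough numerical aid, not a method from the literature: the grid search, the choice of traffic rates, and the function names are assumptions, and the search restricts x to [0, r], which is harmless for nondecreasing cost functions.

```python
def pigou_ratio(c, x, r):
    """The ratio r*c(r) / (x*c(x) + (r - x)*c(r)) for one cost function c."""
    numerator = r * c(r)
    denominator = x * c(x) + (r - x) * c(r)
    if denominator == 0.0:
        # For 0 <= x <= r the numerator is then 0 as well; use 0/0 = 1.
        return 1.0
    return numerator / denominator


def pigou_bound_estimate(c, r_values, steps=2000):
    """Grid-search estimate (from below) of the supremum over x and r."""
    best = 1.0   # x = r always gives ratio 1
    for r in r_values:
        for i in range(steps + 1):
            x = r * i / steps
            best = max(best, pigou_ratio(c, x, r))
    return best


if __name__ == "__main__":
    print(pigou_bound_estimate(lambda x: x, [1.0, 2.0, 5.0]))   # linear cost
    print(pigou_bound_estimate(lambda x: x ** 4, [1.0]))        # quartic cost
```

Because the search only samples finitely many points, the estimate can fall slightly below the true supremum but should not exceed it beyond rounding error.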


Examples 

While  the  defining  equation  (4)  of  the  Pigou  bound  may  appear  fearsome  to  evaluate,  it 
simplifies  to  a  closed-form  expression  for  many  interesting  sets  of  cost  functions. 

Example 2.7 ([83, 89]) If $C = \{ax + b : a, b \ge 0\}$ is the set of linear cost functions, then
elementary calculations show that $\alpha(C) = 4/3$.

Thus  Example  1.1  determines  the  Pigou  bound  for  linear  cost  functions. 
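For completeness, one way to carry out the elementary calculation for linear cost functions is sketched below. It fixes $c(x) = ax + b$ with $a, b \ge 0$ and a traffic rate $r > 0$, and is our own reconstruction rather than a derivation quoted from the cited papers.

```latex
% Sketch: Pigou ratio for c(x) = ax + b with a, b >= 0 and r > 0.
\begin{align*}
x\,c(x) + (r - x)\,c(r)
  &= x(ax + b) + (r - x)(ar + b) \\
  &= a x^2 - a r x + a r^2 + b r, \\
\min_{x \ge 0} \bigl( a x^2 - a r x + a r^2 + b r \bigr)
  &= \tfrac{3}{4}\, a r^2 + b r
  \quad \text{(attained at } x = r/2 \text{ when } a > 0 \text{)}, \\
\sup_{x \ge 0} \frac{r\,c(r)}{x\,c(x) + (r - x)\,c(r)}
  &= \frac{a r^2 + b r}{\tfrac{3}{4}\, a r^2 + b r}
  \;\le\; \frac{4}{3}.
\end{align*}
```

The final inequality is equivalent to $0 \le br$, so it holds with equality exactly when $b = 0$ and $a > 0$, which is the lower edge of Pigou's example.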

Example 2.8 ([27]) Similarly, if C is the set of concave cost functions, then $\alpha(C) = 4/3$.

Example 2.9 ([83]) If C is the set of polynomials with nonnegative coefficients and degree
at most p, then

$$\alpha(C) = \left[1 - p \cdot (p + 1)^{-(p+1)/p}\right]^{-1}. \qquad (5)$$

As p grows large, the right-hand side of (5) tends to infinity as $p / \ln p$ [83, 99].

The  right-hand  side  of  (5)  is  simply  the  price  of  anarchy  in  the  nonlinear  variant  of  Pigou’s 
example  discussed  in  Subsection  1.1.  The  Pigou  bound  for  (nondecreasing)  bounded-degree 
polynomials  with  arbitrary  coefficients  is  not  well  understood,  though  partial  results  have 
recently  been  obtained  by  So  [94]. 

Remark  2.10  One  of  the  most  popular  types  of  cost  functions  in  transportation  science 
applications  is  quartic  functions  with  nonnegative  coefficients  (see  e.g.  Sheffi  [92]).  The 
Pigou  bound  (5)  for  such  functions  is  roughly  2.15. 
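The values of the Pigou bound for small degrees are easy to tabulate. The sketch below reads (5) as $[1 - p (p+1)^{-(p+1)/p}]^{-1}$ and evaluates it for a few degrees, printing $p / \ln p$ alongside for comparison; the function name and the choice of degrees are assumptions.

```python
import math


def pigou_bound_polynomial(p: int) -> float:
    """Evaluate [1 - p * (p + 1)**(-(p + 1)/p)]**(-1) for degree p >= 1."""
    return 1.0 / (1.0 - p * (p + 1) ** (-(p + 1) / p))


if __name__ == "__main__":
    for p in (1, 2, 3, 4, 10, 100, 1000):
        bound = pigou_bound_polynomial(p)
        # p / ln p is only meaningful for p > 1 (ln 1 = 0).
        growth = p / math.log(p) if p > 1 else float("nan")
        print(f"p={p:5d}  bound={bound:10.4f}  p/ln(p)={growth:10.4f}")
```

Degree 1 recovers the value 4/3 of the linear case, and degree 4 gives the quartic value quoted in Remark 2.10.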

Our  final  example  is  for  the  delay  functions  of  M/M/1  queues — queues  with  Poisson 
arrivals  and  exponentially  distributed  service  times — which  are  common  in  computer  network 
applications  (see  e.g.  [10,  11]).  These  delay  functions  correspond  to  cost  functions  of  the  form 
$c(x) = 1/(u - x)$, where u can be interpreted as an edge capacity or a queue service rate.
The value of such a function is defined to be $+\infty$ when $x \ge u$. Allowing infinite costs
requires some technical modifications to the selfish routing model that we ignore in this
survey; see [82] for more details.

The Pigou bound for the set of M/M/1 delay functions is $+\infty$ [42]. Intuitively, this
follows  from  Example  2.9  because  an  M/M/1  delay  function  behaves  like  a  polynomial  with 
arbitrarily  large  degree  when  it  is  nearly  saturated.  In  analogy  to  restricting  the  polynomial 
degree in Example 2.9, we impose a lower bound $u_{\min}$ on all queue service rates and an
upper bound $R_{\max}$ on the value that the traffic rate r can take on in (4).

Example 2.11 ([83]) Suppose $R_{\max} < u_{\min}$ and let $C = \{(u - x)^{-1} : u \ge u_{\min}\}$ be the
set of M/M/1 delay functions with service rate at least $u_{\min}$. Let $\alpha(C)$ denote the
largest-possible price of anarchy in Pigou-like networks with cost functions in C and traffic rate at
most $R_{\max}$. (Formally, $\alpha(C)$ is given by (4) with the additional restriction that $x, r \le R_{\max}$.)
Then

$$\alpha(C) = \frac{1}{2} \left(1 + \sqrt{\frac{u_{\min}}{u_{\min} - R_{\max}}}\right). \qquad (6)$$

The right-hand side of (6) tends to infinity as $R_{\max} \to u_{\min}$, but is bounded by a constant
if $R_{\max}$ is at most a constant fraction of $u_{\min}$.
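The M/M/1 bound can be sanity-checked numerically. The sketch below compares a grid search over x against the closed form $\frac{1}{2}(1 + \sqrt{u_{\min} / (u_{\min} - R_{\max})})$, which we take here as the intended reading of (6). It assumes $0 < R_{\max} < u_{\min}$ and that the worst case occurs at service rate $u_{\min}$ and traffic rate $R_{\max}$; the function names and sample parameters are hypothetical.

```python
import math


def mm1_closed_form(u_min: float, r_max: float) -> float:
    """Closed form (1 + sqrt(u_min / (u_min - r_max))) / 2."""
    return 0.5 * (1.0 + math.sqrt(u_min / (u_min - r_max)))


def mm1_grid_estimate(u_min: float, r_max: float, steps: int = 20000) -> float:
    """Grid search over x in [0, r_max] of the Pigou ratio for the cost
    function c(y) = 1 / (u_min - y) at traffic rate r = r_max."""
    def c(y: float) -> float:
        return 1.0 / (u_min - y)

    r = r_max
    best = 1.0   # x = r gives ratio 1
    for i in range(steps + 1):
        x = r * i / steps
        ratio = (r * c(r)) / (x * c(x) + (r - x) * c(r))
        best = max(best, ratio)
    return best


if __name__ == "__main__":
    for u_min, r_max in ((1.0, 0.5), (2.0, 1.0), (2.0, 1.9)):
        print(u_min, r_max, mm1_grid_estimate(u_min, r_max), mm1_closed_form(u_min, r_max))
```

The last parameter pair, with $R_{\max}$ close to $u_{\min}$, shows the bound starting to grow as the queue nears saturation.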

Simple  Worst-Case  Networks 

The  Pigou  bound  uses  only  simple  networks  to  provide  a  lower  bound  on  the  price  of  anarchy. 
Specifically,  the  next  proposition  follows  immediately  from  the  definition  of  the  bound. 

Proposition 2.12 ([83]) Let C be a set of cost functions that includes all of the constant
functions, and let $\mathcal{I}$ denote the single-commodity instances with a two-node, two-link network
and cost functions in C. Then

$$\rho(\mathcal{I}) \ge \alpha(C).$$

Figure 3: Worst-case networks for inhomogeneous sets of cost functions. The number of
paths and the number of edges in each path can be arbitrarily large.


If  the  set  C  does  not  contain  all  of  the  constant  cost  functions,  then  we  can  obtain  similar 
results  using  modestly  more  complex  networks.  For  example,  suppose  the  set  C  of  cost 
functions is diverse in the sense that $\{c(0) : c \in C\} = [0, \infty)$. Then an edge with the
constant  cost  function  c(x)  =  a  can,  for  all  practical  purposes,  be  “simulated”  by  a  large 
number  of  parallel  edges  that  each  have  a  cost  function  satisfying  c(0)  =  a.  This  observation 
means  that  a  Pigou-like  network  with  a  constant  cost  function  can  be  replaced  by  a  network 
with  two  nodes,  (an  unrestricted  number  of)  parallel  links,  and  cost  functions  in  C  without 
affecting  the  price  of  anarchy. 

Proposition 2.13 ([83]) Let C be a diverse set of cost functions, and let $\mathcal{I}$ denote the
single-commodity instances with a network of parallel links and cost functions in C. Then

$$\rho(\mathcal{I}) \ge \alpha(C).$$

The set of cost functions in Example 2.11 is not diverse when $u_{\min} > 0$, but it is
inhomogeneous in the sense that it contains a function c with c(0) > 0. As in Proposition 2.13, the
Pigou  bound  remains  valid  for  such  sets  of  cost  functions  provided  we  allow  somewhat  more 
complex  networks.  Specifically,  let  a  union  of  paths  mean  a  network  with  one  source,  one 
sink,  and  an  arbitrarily  large  number  of  internally  vertex-disjoint  paths  directed  from  the 
source  to  the  sink  (Figure  3). 

Proposition 2.14 ([83]) Let C be an inhomogeneous set of cost functions, and let $\mathcal{I}$ denote
the single-commodity instances with a network that is a union of paths and with cost functions
in C. Then

$$\rho(\mathcal{I}) \ge \alpha(C).$$

The idea of the proof of Proposition 2.14 is to impose diversity by considering the closure $\bar{C}$
of C under multiplication by positive scalars, apply Proposition 2.13, and use multiple copies
of edges with cost functions in C to simulate edges with cost functions in $\bar{C}$.

Remark 2.15 The Pigou bound does not apply to homogeneous sets C of cost functions,
where c(0) = 0 for all $c \in C$. The price of anarchy of selfish routing is not completely
understood for such sets; the upper bound in the next subsection holds for these sets, but
it  is  not  optimal.  See  [28]  for  refined  upper  bounds  on  the  price  of  anarchy  with  respect  to 
homogeneous  sets  of  sufficiently  low-degree  polynomials. 


2.3  Optimality  of  the  Pigou  Bound 

With  all  the  preliminaries  in  place,  we  can  now  easily  prove  an  upper  bound  on  the  price  of 
anarchy  of  selfish  routing  that  matches  the  Pigou  bound.  For  convenience,  we  first  state  a 
lemma  that  follows  immediately  from  Definition  2.6. 


Lemma 2.16 Let C be a set of cost functions and $\alpha(C)$ the Pigou bound for C. For $c \in C$
and $x, r \ge 0$,

$$x \cdot c(x) \ge \frac{r \cdot c(r)}{\alpha(C)} + (x - r) c(r).$$

We  now  use  this  lemma  and  the  variational  inequality  of  Proposition  2.4  to  prove  the 
optimality  of  the  Pigou  bound. 


Theorem  2.17  ([27,  83])  Let  C  be  a  set  of  cost  functions  and  a(C)  the  Pigou  bound  for  C. 
If  (G,  r,  c)  is  an  instance  with  cost  functions  in  C,  then 

p(G,  r,  c)  <  a(C). 

Proof: Let $f^*$ and $f$ be an optimal flow and a Wardrop equilibrium, respectively, for an
instance $(G,r,c)$ with cost functions in the set $\mathcal{C}$. The theorem follows by writing

$$\begin{aligned}
C(f^*) &= \sum_{e \in E} c_e(f^*_e)\, f^*_e \\
&\ge \frac{1}{\alpha(\mathcal{C})} \sum_{e \in E} c_e(f_e)\, f_e + \sum_{e \in E} (f^*_e - f_e)\, c_e(f_e) \\
&\ge \frac{C(f)}{\alpha(\mathcal{C})},
\end{aligned}$$

where for the first inequality we have applied Lemma 2.16 to each edge $e$ with $x = f^*_e$ and
$r = f_e$, and the second inequality follows from Proposition 2.4. ■

Theorem  2.17  implies  that  the  lower  bounds  on  the  price  of  anarchy  in  Examples  2.7- 
2.11  are  the  best  possible.  Thus  the  price  of  anarchy  of  networks  with  linear  (or  concave) 
cost  functions  is  precisely  4/3;  the  price  of  anarchy  of  networks  with  cost  functions  that 
are polynomials with nonnegative coefficients and degree at most $p$ is precisely the right-hand
side of (5); and the price of anarchy of instances with sum of all traffic rates at most
$R_{\max}$, cost functions that are M/M/1 delay functions, and service rates bounded below by
$u_{\min} > R_{\max}$ is precisely the right-hand side of (6).
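
As a numerical sanity check on these tight bounds, the following minimal sketch approximates the Pigou bound for the single cost function $c(x) = x^p$. It assumes that the Pigou bound of Definition 2.6 (not reproduced in this section) is the supremum over $x, r \ge 0$ of $r \cdot c(r) / (x \cdot c(x) + (r - x) \cdot c(r))$, which is the form consistent with Lemma 2.16. The function names and the grid search are illustrative choices, and the closed form is simply what minimizing the denominator by calculus gives.

```python
def pigou_bound_monomial(p, grid=100_000):
    """Approximate the Pigou bound for the cost function c(x) = x**p.

    Assumption: the bound is sup over x, r >= 0 of
        r*c(r) / (x*c(x) + (r - x)*c(r)).
    For a monomial the ratio depends only on x/r, so we fix r = 1.
    For x > 1 the denominator is at least 1, so x in [0, 1] suffices.
    """
    best = 1.0
    for j in range(grid + 1):
        x = j / grid
        denom = x ** (p + 1) + (1.0 - x)  # x*c(x) + (r - x)*c(r) with r = 1
        best = max(best, 1.0 / denom)
    return best


def pigou_bound_closed_form(p):
    """Value obtained by minimizing the denominator at x = (p + 1)**(-1/p)."""
    return 1.0 / (1.0 - p * (p + 1) ** (-(p + 1) / p))


if __name__ == "__main__":
    for degree in (1, 2, 3):
        print(degree, pigou_bound_monomial(degree), pigou_bound_closed_form(degree))
```

For $p = 1$ both functions give 4/3, matching the value stated above for linear cost functions.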

Moreover,  since  the  Pigou  bound  is  based  only  on  the  simplest  of  instances,  the  matching 
upper  bound  of  Theorem  2.17  implies  that  simple  networks  always  furnish  worst-possible 
examples of the inefficiency of selfish routing. Precisely, Propositions 2.12-2.14 and
Theorem 2.17 give the following corollary.

Corollary 2.18 Let C be a set of cost functions.

(a)  If  C  contains  the  constant  functions,  then  the  price  of  anarchy  of  instances  with  cost 
functions  in  C  is  achieved,  up  to  an  arbitrarily  small  factor,  by  a  single-commodity 
instance  with  a  two-node,  two-link  network. 

(b) If C is diverse, then the price of anarchy of instances with cost functions in C is
achieved, up to an arbitrarily small factor, by a single-commodity instance with a
network of parallel links.

(c) If C is inhomogeneous, then the price of anarchy of instances with cost functions in C
is achieved, up to an arbitrarily small factor, by a single-commodity instance with a
network that is a union of paths.

Informally,  Corollary  2.18  states  that  the  price  of  anarchy  is  controlled  only  by  the  set  of 
allowable  cost  functions,  and  is  essentially  independent  of  the  number  of  commodities  and 
of  the  complexity  of  the  allowable  network  topologies. 

Remark  2.19  Theorem  2.17  has  undergone  several  iterations  in  just  a  few  short  years.  It 
was  first  proved  for  the  special  case  of  linear  cost  functions  in  Roughgarden  and  Tardos  [89]. 
Roughgarden [79] then proved Theorem 2.17 for bounded-degree polynomials with nonnegative
coefficients. The proof in [79] was fairly complex and did not explicitly take advantage
of the variational inequality given in Proposition 2.4. Roughgarden [81] extended this proof
and established Theorem 2.17 for all sets of cost functions that satisfy a weak technical
condition (met by essentially all cost functions that arise in applications). Ronen [76] pointed
out  that  Proposition  2.4  could  be  used  to  vastly  simplify  the  proof  of  Theorem  2.17,  under 
the  same  technical  condition.  This  revised  analysis  appears  in  [83].  Correa,  Schulz,  and  Stier 
Moses  [27]  then  showed  that,  once  the  proof  is  based  on  Proposition  2.4,  it  can  be  modified 
so  that  no  technical  conditions  whatsoever  are  needed.  The  proof  of  Theorem  2.17  given 
above  is  taken  from  [27].  More  recently,  two  more  proofs  of  Theorem  2.17  have  been  given 
by  Tardos  [96]  and  Correa,  Schulz,  and  Stier  Moses  [28]. 

3  Bounding  Braess’s  Paradox 

This  section  studies  the  worst-possible  severity  of  Braess’s  Paradox.  Subsection  3.1  gives  a 
construction  that  shows  that  the  severity  of  Braess’s  Paradox  can  grow  with  the  network 
size  when  nonlinear  cost  functions  and  multiple  edge  removals  are  permitted.  Subsection  3.2 
proves  matching  upper  bounds  for  single-commodity  networks.  Subsection  3.3  gives  a  brief 
overview  of  Braess’s  Paradox  in  multicommodity  networks.  Finally,  Subsection  3.4  presents 
negative  results  for  the  computational  problem  of  efficiently  detecting  Braess’s  Paradox. 

3.1  A  Bigger  Braess’s  Paradox 

The discovery of Braess’s Paradox [14] immediately intrigued researchers and catalyzed
numerous research directions (see [78] for a survey). However, nearly all of this work focused
on Braess’s original four-node network (Figure 2) and variants thereof. We next show that
Braess’s  original  example  is  merely  the  tip  of  the  iceberg:  Braess’s  Paradox  can  be  arbitrarily 
severe  in  large  single-commodity  networks. 

We  measure  the  severity  of  Braess’s  Paradox  with  the  Braess  ratio — the  factor  by  which 
the  cost  of  a  Wardrop  equilibrium  exceeds  that  of  an  equilibrium  in  a  subnetwork. 


Definition 3.1 The Braess ratio $\beta(G,r,c)$ of a single-commodity instance $(G,r,c)$ is

$$\beta(G,r,c) = \max_{H \subseteq G} \frac{C(f)}{C(f^H)}, \qquad (7)$$

where $H$ ranges over subnetworks of $G$ that contain an $s$-$t$ path, and $f$ and $f^H$ denote
Wardrop equilibria for $(G,r,c)$ and $(H,r,c)$, respectively.

As with Definition 2.5, the Braess ratio of an instance $(G,r,c)$ is well defined unless it admits
a flow with zero cost, in which case we define $\beta(G,r,c)$ to be 1.


Remark  3.2  For  now,  we  only  define  the  Braess  ratio  for  single-commodity  networks.  There 
are  multiple  ways  to  extend  Definition  3.1  to  multicommodity  networks;  see  Subsection  3.3 
for  details. 

The Braess ratio in Example 1.2 is 4/3. No larger Braess ratio is possible in single-commodity
networks with linear cost functions. This fact is a consequence of the following
close  connection  between  the  price  of  anarchy  and  the  Braess  ratio. 


Proposition 3.3 If $(G,r,c)$ is a single-commodity instance, then

$$\beta(G,r,c) \le \rho(G,r,c).$$

Proof: For every subgraph $H$ of $G$, a Wardrop equilibrium $f^H$ of $(H,r,c)$ is a feasible flow
for $(G,r,c)$. By the definition of the price of anarchy, the cost of a Wardrop equilibrium for
$(G,r,c)$ is at most $\rho(G,r,c)$ times the cost of any feasible flow for $(G,r,c)$, and in
particular at most $\rho(G,r,c) \cdot C(f^H)$. ■

As promised, Theorem 2.17 and Proposition 3.3 imply that every single-commodity
instance with linear cost functions has a Braess ratio of at most 4/3. The upper bound in
Proposition  3.3  is  also  tight,  up  to  constant  factors,  for  many  other  types  of  cost  functions 
(see  Remark  3.6  below). 
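
The 4/3 figure can be checked by direct computation. The sketch below assumes that Example 1.2 is the standard Braess network with traffic rate 1, cost $c(x) = x$ on edges $(s,v)$ and $(w,t)$, cost 1 on $(s,w)$ and $(v,t)$, and cost 0 on $(v,w)$; the helper names are hypothetical. It verifies the Wardrop condition path by path and then forms the ratio of equilibrium costs with and without the edge $(v,w)$.

```python
from fractions import Fraction

# Assumed cost functions of Example 1.2 (the classic Braess network).
COSTS = {
    ("s", "v"): lambda x: x,
    ("w", "t"): lambda x: x,
    ("s", "w"): lambda x: Fraction(1),
    ("v", "t"): lambda x: Fraction(1),
    ("v", "w"): lambda x: Fraction(0),
}


def edge_loads(path_flows):
    """Sum path flows into per-edge loads; paths are tuples of vertices."""
    loads = {}
    for path, amount in path_flows.items():
        for edge in zip(path, path[1:]):
            loads[edge] = loads.get(edge, Fraction(0)) + amount
    return loads


def path_cost(path, loads):
    return sum(COSTS[e](loads.get(e, Fraction(0))) for e in zip(path, path[1:]))


def is_wardrop(path_flows, all_paths):
    """Every flow-carrying path must be a cheapest path among all_paths."""
    loads = edge_loads(path_flows)
    cheapest = min(path_cost(p, loads) for p in all_paths)
    return all(path_cost(p, loads) == cheapest
               for p, x in path_flows.items() if x > 0)


def total_cost(path_flows):
    loads = edge_loads(path_flows)
    return sum(x * path_cost(p, loads) for p, x in path_flows.items())


PATHS_G = [("s", "v", "t"), ("s", "w", "t"), ("s", "v", "w", "t")]
PATHS_H = [("s", "v", "t"), ("s", "w", "t")]  # edge (v, w) removed

flow_G = {("s", "v", "w", "t"): Fraction(1)}
flow_H = {("s", "v", "t"): Fraction(1, 2), ("s", "w", "t"): Fraction(1, 2)}

braess_ratio = total_cost(flow_G) / total_cost(flow_H)
print(braess_ratio)  # 4/3
```

Exact rational arithmetic is used so that the equality test in the Wardrop condition is not disturbed by rounding.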

Exhibiting  a  family  of  instances  with  arbitrarily  large  Braess  ratios  requires  a  new,  more 
complicated  construction  than  those  we  have  seen  so  far.  Proposition  3.3  implies  that  such  a 
family  must  make  use  of  cost  functions  drawn  from  a  sufficiently  rich  set  (such  as  polynomials 
with  unbounded  degree).  We  encountered  one  such  family  in  the  nonlinear  variant  of  Pigou’s 
example  (Subsection  1.1),  but  it  is  easy  to  see  that  all  of  these  instances  have  a  Braess  ratio 
of  1.  There  is  also  an  analogous  nonlinear  variant  of  Example  1.2,  obtained  by  replacing  the 
linear cost functions on the edges $(s,v)$ and $(w,t)$ with the functions $c(x) = x^p$ for $p$ large.
The Braess ratios of these instances approach 2 as $p \to \infty$. As we will see in Subsection 3.2,
a  Braess  ratio  larger  than  2  cannot  arise  without  allowing  larger  networks  and  multiple  edge 
removals. 

Our  main  result  in  this  subsection  is  the  following. 

Figure 4: The second and third Braess graphs, (a) $B^2$ and (b) $B^3$. Edges are labeled with their types.


Theorem 3.4 ([78]) For every $n \ge 2$, there is a single-commodity instance $(G,r,c)$ with $n$
vertices and

$$\beta(G,r,c) \ge \lfloor n/2 \rfloor.$$


Proof:  We  can  assume  that  n  is  even,  since  the  odd  case  reduces  to  the  even  case  by  adding 
an  isolated  vertex.  We  can  also  assume  that  n  is  at  least  4.  Write  n  =  2k  +  2  for  a  positive 
integer  k. 

We next define the $k$th Braess graph $B^k$. Start with a set of $2k+2$ vertices $V^k =
\{s, v_1, \dots, v_k, w_1, \dots, w_k, t\}$. The edge set $E^k$ is the union of the sets
$\{(s,v_i), (v_i,w_i), (w_i,t) : 1 \le i \le k\}$, $\{(v_i,w_{i-1}) : 2 \le i \le k\}$, and
$\{(v_1,t)\} \cup \{(s,w_k)\}$ (see Figure 4). Call edges of the form $(v_i,w_i)$ the type A edges,
edges of the form $(v_i,w_{i-1})$, $(s,w_k)$, and $(v_1,t)$ the type B edges, and edges of the form
$(s,v_i)$ and $(w_i,t)$ the type C edges (see Figure 4). Note that
$B^1$ is the graph in the original Braess’s Paradox (Figure 2(b)).

Define  cost  functions  on  the  edges  of  Bk  as  follows. 


(A) Type A edges are given the cost function $c^k(x) = 0$.

(B) Type B edges are given the cost function $c^k(x) = 1$.

(C) For each $i \in \{1, 2, \dots, k\}$, the type C edges $(w_i,t)$ and $(s,v_{k-i+1})$ are given a continuous,
nondecreasing cost function $c^k(x)$ with $c^k(k/(k+1)) = 0$ and $c^k(1) = i$.


Figure 5: Proof of Theorem 3.4, when $k = 3$. (a) Wardrop equilibrium in $(B^3, 3, c^3)$. (b) Wardrop
equilibrium in the optimal subgraph. Solid edges carry traffic in the Wardrop
equilibrium, dashed edges do not. Edge costs are with respect to the Wardrop equilibrium.

For $i = 1, \dots, k$, let $P_i$ denote the path $s \to v_i \to w_i \to t$. For $i = 2, \dots, k$, let $Q_i$ denote
the path $s \to v_i \to w_{i-1} \to t$. Define $Q_1$ to be the path $s \to v_1 \to t$ and $Q_{k+1}$ the path
$s \to w_k \to t$. On one hand, routing one unit of flow on each of $P_1, \dots, P_k$ yields a Wardrop
equilibrium $f$ for $(B^k, k, c^k)$ in which all traffic incurs cost $k+1$ (Figure 5(a)). On the other
hand, if $H$ is the subgraph obtained from $B^k$ by deleting the $k$ type A edges, then routing
$k/(k+1)$ units of flow on each of $Q_1, \dots, Q_{k+1}$ yields a Wardrop equilibrium $f^H$ for $(H, k, c^k)$
in which all traffic incurs only one unit of cost (Figure 5(b)). Thus

$$\beta(G,r,c) \ge C(f)/C(f^H) = k + 1 = n/2,$$

completing the proof. ■
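
The construction in this proof can be checked mechanically for small $k$. The sketch below builds the paths $P_i$ and $Q_i$ of $B^k$, picks one admissible family of type C cost functions (a piecewise-linear choice, which is an assumption of this sketch; the proof only requires continuity, monotonicity, and the two prescribed values), and verifies that both flows are Wardrop equilibria. It relies on the observation that the $P_i$ and $Q_i$ are the only $s$-$t$ paths of $B^k$. All function names are hypothetical.

```python
from fractions import Fraction


def braess_paths(k):
    """Return (P, Q): paths P_1..P_k and Q_1..Q_{k+1} of B^k as edge lists."""
    P = [[("s", f"v{i}"), (f"v{i}", f"w{i}"), (f"w{i}", "t")]
         for i in range(1, k + 1)]
    Q = [[("s", "v1"), ("v1", "t")]]
    Q += [[("s", f"v{i}"), (f"v{i}", f"w{i - 1}"), (f"w{i - 1}", "t")]
          for i in range(2, k + 1)]
    Q.append([("s", f"w{k}"), (f"w{k}", "t")])
    return P, Q


def edge_cost(edge, x, k):
    """Cost of an edge at load x under one admissible choice of costs."""
    u, v = edge
    if u[0] == "v" and v[0] == "w" and u[1:] == v[1:]:
        return Fraction(0)              # type A edge (v_i, w_i)
    if u == "s" and v[0] == "v":
        i = k - int(v[1:]) + 1          # type C edge (s, v_{k-i+1})
    elif u[0] == "w" and v == "t":
        i = int(u[1:])                  # type C edge (w_i, t)
    else:
        return Fraction(1)              # type B edge
    # Assumed type C cost: 0 up to load k/(k+1), then linear, equal to i at 1.
    return max(Fraction(0), i * ((k + 1) * x - k))


def path_costs(paths, flows, k):
    load = {}
    for path, amount in zip(paths, flows):
        for e in path:
            load[e] = load.get(e, Fraction(0)) + amount
    return [sum(edge_cost(e, load[e], k) for e in path) for path in paths]


def is_wardrop(paths, flows, k):
    costs = path_costs(paths, flows, k)
    return all(c == min(costs) for c, x in zip(costs, flows) if x > 0)


def braess_ratio_lower_bound(k):
    P, Q = braess_paths(k)
    in_G = [Fraction(1)] * k + [Fraction(0)] * (k + 1)  # one unit on each P_i
    in_H = [Fraction(k, k + 1)] * (k + 1)               # k/(k+1) on each Q_i
    assert is_wardrop(P + Q, in_G, k) and is_wardrop(Q, in_H, k)
    cost_G = sum(x * c for x, c in zip(in_G, path_costs(P + Q, in_G, k)))
    cost_H = sum(x * c for x, c in zip(in_H, path_costs(Q, in_H, k)))
    return cost_G / cost_H
```

Calling `braess_ratio_lower_bound(k)` returns $k + 1$ for each small $k$ tried, in line with the computation $C(f)/C(f^H) = k(k+1)/k$.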


Remark 3.5 In the proof of Theorem 3.4, the subgraph $H$ was obtained from $B^k$ by
removing $k$ edges. Thus, for every positive integer $k$, there is a single-commodity instance for
which removing $k$ edges can decrease the cost of a Wardrop equilibrium by a factor of $k + 1$.

Remark  3.6  The  construction  in  the  proof  of  Theorem  3.4  can  also  be  adapted  to  scenarios 
where  arbitrary  cost  functions  are  not  allowed.  For  example,  suppose  cost  functions  are 
restricted  to  be  polynomials  with  nonnegative  coefficients  and  degree  at  most  p.  Consider 
the instance $(B^k, k, c)$, where $k \approx p/\ln p$, and where the cost functions $c$ for $B^k$ are identical
to those in the proof of Theorem 3.4, except that a type C edge of the form $(w_i,t)$ or
$(s,v_{k-i+1})$ receives the cost function $i x^p$. Arguing as in the proof of Theorem 3.4 shows that
the Braess ratio of $(B^k, k, c)$ is $\Omega(k) = \Omega(p/\ln p)$ as $p \to \infty$. This Braess ratio matches,
up  to  a  constant  factor,  the  upper  bound  for  this  set  of  cost  functions  that  follows  from 
Theorem  2.17  and  Proposition  3.3.  See  [78]  for  more  details  and  further  examples. 

3.2  A  Matching  Upper  Bound 

This  subsection  shows  that,  among  single-commodity  networks,  the  Braess  ratio  is  maximized 
by  the  networks  constructed  in  the  proof  of  Theorem  3.4. 

Theorem 3.7 ([78]) If $(G,r,c)$ is a single-commodity instance with $n$ vertices, then

$$\beta(G,r,c) \le \lfloor n/2 \rfloor.$$

Following Lin, Roughgarden, and Tardos [61], we will obtain Theorem 3.7 as a
consequence of a more general theorem. The statement of this more general result uses the
following  definition. 

Definition  3.8  Let  (G,r,  c)  be  a  single-commodity  instance  and  S  a  subset  of  the  edges 
of  G.  The  set  S  is  sparse  if  no  two  edges  of  S  share  an  endpoint,  and  in  addition  no  edge  of 
S  is  incident  to  s  or  t. 

In other words, a set of edges is sparse if and only if they form an (undirected) matching of
$G \setminus \{s,t\}$.

Our  general  bound  on  Braess’s  Paradox  states  that  the  size  of  the  largest  sparse  set 
removed  controls  how  much  the  cost  of  a  Wardrop  equilibrium  can  decrease. 

Theorem 3.9 ([61]) Let $(G,r,c)$ be a single-commodity instance, $H$ a subgraph of $G$, and
$f$ and $\tilde f$ Wardrop equilibria for $(G,r,c)$ and $(H,r,c)$, respectively. Let $S$ denote the edges in
$G$ but not $H$. If every sparse subset of $S$ contains at most $k$ edges, then

$$C(f) \le (k+1) \cdot C(\tilde f).$$

Before  proving  Theorem  3.9,  we  show  that  it  easily  implies  Theorem  3.7,  as  well  as  an 
upper  bound  on  the  severity  of  Braess’s  Paradox  that  is  parameterized  by  the  number  of 
edges  removed. 

Proof of Theorem 3.7: Since there are only $n - 2$ vertices of $G$ that are not $s$ or $t$, every
sparse set of edges has at most $\lfloor (n-2)/2 \rfloor = \lfloor n/2 \rfloor - 1$ edges. Theorem 3.9, applied with
$k = \lfloor n/2 \rfloor - 1$, now implies the theorem. ■

The  next  corollary  implies  that  the  only  way  to  achieve  arbitrarily  large  Braess  ratios  is 
to  allow  an  unlimited  number  of  edge  removals,  answering  a  question  of  Kameda  [53]. 

Corollary 3.10 ([61]) Removing $k$ edges from a single-commodity network decreases the
cost of a Wardrop equilibrium by at most a factor of $k + 1$.

Proof: Every sparse subset of a set of $k$ removed edges contains at most $k$ edges, so the claim
is immediate from Definition 3.8 and Theorem 3.9. ■

In  particular,  we  noted  earlier  that  simple  nonlinear  variants  on  Braess’s  original  example 
achieve  a  Braess  ratio  arbitrarily  close  to  2;  if  only  a  single  edge  removal  is  allowed,  then  no 
single-commodity  instance  has  a  larger  Braess  ratio.  More  generally,  the  construction  in  the 
proof  of  Theorem  3.4  matches  the  bound  of  Corollary  3.10  for  every  k  (see  Remark  3.5). 

We  now  turn  toward  the  proof  of  Theorem  3.9.  This  proof  will  be  more  delicate  than  the 
upper  bounds  on  the  price  of  anarchy  given  in  Section  2.  In  particular,  our  proof  must  be 
sensitive  to  the  number  of  network  vertices,  whereas  two-node  networks  typically  determine 
the  price  of  anarchy  (Corollary  2.18).  Because  of  this,  our  techniques  will  have  a  much 
stronger  combinatorial  flavor  than  those  in  Section  2. 

Because cost functions are arbitrary in Theorem 3.9, we take a modest approach to
bounding the cost of a Wardrop equilibrium $f$ in the original network relative to that of a
Wardrop equilibrium $\tilde f$ in a subnetwork. We will identify edges on which $\tilde f$ routes at least
as much traffic as $f$. Since cost functions are nondecreasing, the cost incurred by $\tilde f$ on these
edges is at least that incurred by $f$. The next definition is largely motivated by this idea.

Definition 3.11 Let $f$ and $\tilde f$ be feasible flows for the instance $(G,r,c)$.

(a) An edge $e$ of $G$ is $(f,\tilde f)$-light if $f_e \le \tilde f_e$ and $\tilde f_e > 0$, $(f,\tilde f)$-heavy if $f_e > \tilde f_e$, and
$(f,\tilde f)$-useless if $f_e = \tilde f_e = 0$.

(b) An undirected path is $(f,\tilde f)$-alternating if it comprises only forward $(f,\tilde f)$-light edges
and backward $(f,\tilde f)$-heavy edges.

When the context is clear, we drop the dependence on $f$ and $\tilde f$ from the terms in
Definition 3.11.

Example 3.12 Consider the Braess’s Paradox network (Figure 2(b)). Let $f$ be the Wardrop
equilibrium and $\tilde f$ the optimal flow, which splits the traffic evenly between the paths
$s \to v \to t$ and $s \to w \to t$. Then, edges $(s,v)$, $(v,w)$, and $(w,t)$ are $(f,\tilde f)$-heavy while edges
$(s,w)$ and $(v,t)$ are $(f,\tilde f)$-light. The unique $(f,\tilde f)$-alternating $s$-$t$ path is $s \to w \to v \to t$.

The  next  lemma  states  that,  for  every  pair  of  feasible  flows,  an  s-t  alternating  path  exists. 
It  is  an  easy  consequence  of  flow  conservation  arguments. 

Lemma 3.13 Let $f$ and $\tilde f$ be flows feasible for the single-commodity instance $(G,r,c)$. Then
there is an $(f,\tilde f)$-alternating $s$-$t$ path. Moreover, if $f$ is directed acyclic, then every such path
begins and ends with an $(f,\tilde f)$-light edge.

Proof: Suppose for contradiction that there is no $(f,\tilde f)$-alternating $s$-$t$ path and let $S$ denote
the set of nodes reachable from $s$ via $(f,\tilde f)$-alternating paths. The set $S$ contains $s$ and, by
assumption, does not contain $t$. Since $S$ is an $s$-$t$ cut, the net $f$-flow and the net $\tilde f$-flow
exiting $S$ are both precisely $r$.

Since vertices in $S$ can be reached from $s$ via $(f,\tilde f)$-alternating paths and vertices outside
$S$ cannot, edges that exit $S$ cannot be $(f,\tilde f)$-light, and edges that enter $S$ cannot be
$(f,\tilde f)$-heavy. Since the net flow across $S$ is positive (assuming $r > 0$), some non-useless (and thus
$(f,\tilde f)$-heavy) edge exits $S$. Taken together, these facts imply that the net $f$-flow exiting $S$
is strictly greater than the net $\tilde f$-flow exiting $S$, a contradiction.

Moreover, if $f$ is directed acyclic, then it sends no flow into $s$ or out of $t$. Thus, the first
and last edges of every $(f,\tilde f)$-alternating $s$-$t$ path must be $(f,\tilde f)$-light. ■
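
The search implicit in this proof is easy to carry out. The following minimal sketch classifies edges as light, heavy, or useless according to Definition 3.11 and then finds an alternating $s$-$t$ path by breadth-first search, moving forward along light edges and backward along heavy edges. The function names and the dictionary-based flow representation are assumptions of this sketch.

```python
from collections import deque
from fractions import Fraction


def classify(f, f_tilde, edges):
    """Label each edge 'light', 'heavy', or 'useless' (Definition 3.11)."""
    kinds = {}
    for e in edges:
        a, b = f.get(e, 0), f_tilde.get(e, 0)
        if a <= b and b > 0:
            kinds[e] = "light"
        elif a > b:
            kinds[e] = "heavy"
        else:
            kinds[e] = "useless"  # here a == b == 0 for nonnegative flows
    return kinds


def alternating_path(f, f_tilde, edges, s="s", t="t"):
    """Return the vertices of an alternating s-t path, or None if none exists."""
    kinds = classify(f, f_tilde, edges)
    parent = {s: None}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        if u == t:
            break
        for (a, b) in edges:
            if a == u and kinds[(a, b)] == "light":
                nxt = b           # traverse a light edge forward
            elif b == u and kinds[(a, b)] == "heavy":
                nxt = a           # traverse a heavy edge backward
            else:
                continue
            if nxt not in parent:
                parent[nxt] = u
                queue.append(nxt)
    if t not in parent:
        return None
    path, node = [], t
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


# The flows of Example 3.12 on the Braess network.
EDGES = [("s", "v"), ("s", "w"), ("v", "w"), ("v", "t"), ("w", "t")]
f = {("s", "v"): Fraction(1), ("v", "w"): Fraction(1), ("w", "t"): Fraction(1)}
half = Fraction(1, 2)
f_tilde = {("s", "v"): half, ("v", "t"): half, ("s", "w"): half, ("w", "t"): half}

print(alternating_path(f, f_tilde, EDGES))  # ['s', 'w', 'v', 't']
```

On the flows of Example 3.12 the search recovers the path $s \to w \to v \to t$, which begins and ends with a light edge as the lemma predicts.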

Our  proof  of  Theorem  3.9  will  proceed  by  induction  along  an  alternating  path,  repeatedly 
using  the  shortest-path  structure  of  a  Wardrop  equilibrium.  This  structure  is  summarized 
by  the  following  characterization  of  such  equilibria. 

Lemma 3.14 Let $f$ be a flow feasible for the single-commodity instance $(G,r,c)$. For a
vertex $v$ in $G$, let $d(v)$ denote the length, with respect to edge lengths $c_e(f_e)$, of a shortest $s$-$v$
path in $G$. Then

$$d(w) - d(v) \le c_e(f_e)$$

for every edge $e = (v,w)$, and $f$ is a Wardrop equilibrium if and only if equality holds
whenever $f_e > 0$.

Lemma  3.14  follows  from  Definition  2.1  and  basic  properties  of  shortest  paths. 
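
The characterization in Lemma 3.14 yields a direct test for the Wardrop condition: compute shortest-path distances under the edge lengths $c_e(f_e)$ and check that every flow-carrying edge is tight. The sketch below does this with a plain Bellman-Ford pass. The example network is assumed to be the Braess network with cost $x$ on $(s,v)$ and $(w,t)$, cost 1 on $(s,w)$ and $(v,t)$, and cost 0 on $(v,w)$; the function names are hypothetical.

```python
from fractions import Fraction


def shortest_distances(vertices, lengths, source="s"):
    """Bellman-Ford distances from source; None marks unreachable vertices."""
    dist = {v: None for v in vertices}
    dist[source] = Fraction(0)
    for _ in range(len(vertices) - 1):
        for (u, w), length in lengths.items():
            if dist[u] is None:
                continue
            if dist[w] is None or dist[u] + length < dist[w]:
                dist[w] = dist[u] + length
    return dist


def is_wardrop_by_distances(vertices, costs, flow, source="s"):
    """Check d(w) - d(v) == c_e(f_e) on every edge e = (v, w) with f_e > 0."""
    lengths = {e: c(flow.get(e, Fraction(0))) for e, c in costs.items()}
    d = shortest_distances(vertices, lengths, source)
    return all(d[w] - d[v] == lengths[(v, w)]
               for (v, w) in costs if flow.get((v, w), 0) > 0)


VERTICES = ["s", "v", "w", "t"]
COSTS = {
    ("s", "v"): lambda x: x,
    ("w", "t"): lambda x: x,
    ("s", "w"): lambda x: Fraction(1),
    ("v", "t"): lambda x: Fraction(1),
    ("v", "w"): lambda x: Fraction(0),
}

# All traffic on s -> v -> w -> t: every flow-carrying edge is tight.
equilibrium = {("s", "v"): Fraction(1), ("v", "w"): Fraction(1),
               ("w", "t"): Fraction(1)}
# Even split over the two two-hop paths: edge (v, t) is not tight.
half = Fraction(1, 2)
even_split = {("s", "v"): half, ("v", "t"): half,
              ("s", "w"): half, ("w", "t"): half}

print(is_wardrop_by_distances(VERTICES, COSTS, equilibrium))  # True
print(is_wardrop_by_distances(VERTICES, COSTS, even_split))   # False
```

In the even split, the zero-cost edge $(v,w)$ gives $d(t) = 1$ and $d(v) = 1/2$, so the flow-carrying edge $(v,t)$ of length 1 violates equality.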

Lemma  3.14,  or  the  fact  that  Wardrop  equilibria  minimize  the  potential  function  in  (1), 
easily  implies  the  following  strengthening  of  Proposition  2.3  for  single-commodity  instances. 

Lemma 3.15 Every single-commodity instance admits a Wardrop equilibrium that is a
directed acyclic flow.

Details  of  the  proofs  of  Lemmas  3.14  and  3.15  can  be  found  in  [78,  86].  We  are  finally 
prepared  to  prove  Theorem  3.9. 

Proof of Theorem 3.9: Let $f$ be a directed acyclic Wardrop equilibrium for $(G,r,c)$ and $\tilde f$ a
Wardrop equilibrium for $(H,r,c)$. We view $\tilde f$ as a flow in the larger network $G$ in the obvious
way. For a vertex $v$, let $d(v)$ denote the shortest-path distance from $s$ to $v$ with respect to
edge lengths $c_e(f_e)$ in $G$, and $\tilde d(v)$ the $s$-$v$ distance with respect to $c_e(\tilde f_e)$ in $H$. Note that
Definition 2.1 and the definition of cost (3) imply that $C(f) = r \cdot d(t)$ and $C(\tilde f) = r \cdot \tilde d(t)$, so
the theorem reduces to proving that $d(t) \le (k+1) \cdot \tilde d(t)$, where $k$ is the size of some sparse
set of edges present in $G$ but not $H$.

Let $P$ be an $(f,\tilde f)$-alternating $s$-$t$ path, which exists by Lemma 3.13. A segment of $P$
is a maximal subpath of $P$ that contains only $(f,\tilde f)$-light or only $(f,\tilde f)$-heavy edges. Edges
that are in $G$ but not $H$ are called absent. Since $\tilde f_e > 0$ on $(f,\tilde f)$-light edges, absent edges
can only reside in $(f,\tilde f)$-heavy segments. The key claim is that if $v$ is a vertex at the end of
a segment of $P$, and $i$ (heavy) segments of $P$ between $s$ and $v$ contain an absent edge, then

$$d(v) \le \tilde d(v) + i \cdot \tilde d(t). \qquad (8)$$

This claim implies the theorem. To see why, first apply (8) to $t$ to obtain

$$d(t) \le \tilde d(t) + k \cdot \tilde d(t) = (k+1) \cdot \tilde d(t), \qquad (9)$$

where $k$ is the number of segments of $P$ that include an absent edge. Inequality (9) reduces
the proof of the theorem to exhibiting a sparse set of $k$ absent edges. Since $f$ is a directed
acyclic flow, Definition 3.11 and Lemma 3.13 imply that the $(f,\tilde f)$-heavy segments of $P$
are disjoint from each other and from $s$ and $t$. Picking one absent edge from each of the $k$
$(f,\tilde f)$-heavy segments of $P$ that contain one thus provides the desired sparse set.

We now prove (8) by induction on the segments of $P$. The inequality trivially holds when
$v = s$, so suppose it holds for a vertex $v$ that is last on a segment of $P$, or is the source $s$.
We wish to prove (8) for $w$, defined as the last vertex on the next segment. Let $i$ denote
the number of earlier segments of $P$ that contain at least one absent edge. By the inductive
hypothesis, $d(v) \le \tilde d(v) + i \cdot \tilde d(t)$.

The inductive step has two cases. For the first case, suppose that the segment between
$v$ and $w$ contains at least one absent edge. As absent edges can only be $(f,\tilde f)$-heavy, this
segment comprises only $(f,\tilde f)$-heavy backward edges. Since there is a path of (heavy) edges
from $w$ to $v$, each carrying $f$-flow, Lemma 3.14 implies that $d(w) \le d(v)$. Since the path $P$
begins with an $(f,\tilde f)$-light edge (Lemma 3.13), $v \ne s$ and there is an $(f,\tilde f)$-light edge entering
$v$. Since $\tilde f$ routes flow into $v$, it must route flow from $v$ to $t$. By Lemma 3.14, $\tilde d(v) \le \tilde d(t)$.
Combining what we know with the inductive hypothesis, the proof of the inductive step is
complete:

$$d(w) \le d(v) \le \tilde d(v) + i \cdot \tilde d(t) \le (i+1) \cdot \tilde d(t) \le \tilde d(w) + (i+1) \cdot \tilde d(t).$$

For the second case of the inductive step for (8), suppose that the current segment $Q \subseteq P$
contains no absent edges. We will prove, by induction on the vertices of $Q$, that

$$d(x) \le \tilde d(x) + i \cdot \tilde d(t) \qquad (10)$$

for all vertices $x$ of $Q$. The base case ($x = v$) follows from the outer inductive hypothesis (8).
For the (inner) inductive step, suppose $d(x) \le \tilde d(x) + i \cdot \tilde d(t)$ for a vertex $x$ of $Q$ and let $y$
denote the next vertex on the segment.

If the edge $e = (x,y) \in P$ is $(f,\tilde f)$-light, then $c_e(f_e) \le c_e(\tilde f_e)$ and $\tilde f_e > 0$. Since $f$ and $\tilde f$ are
Wardrop equilibria, Lemma 3.14 and the inductive hypothesis imply that

$$d(y) \le d(x) + c_e(f_e) \le \tilde d(x) + c_e(\tilde f_e) + i \cdot \tilde d(t) = \tilde d(y) + i \cdot \tilde d(t),$$

which establishes (10) for the vertex $y$.

If the edge $e = (y,x) \in P$ is $(f,\tilde f)$-heavy, then

$$\begin{aligned}
d(y) &= d(x) - c_e(f_e) && (11) \\
&\le \tilde d(x) + i \cdot \tilde d(t) - c_e(\tilde f_e) && (12) \\
&\le \tilde d(y) + i \cdot \tilde d(t), && (13)
\end{aligned}$$

where equation (11) follows from Lemma 3.14 and the fact that $f_e > 0$, inequality (12)
follows from the inductive hypothesis and the fact that $\tilde f_e < f_e$, and inequality (13) follows
from Lemma 3.14.

In either case, the inner inductive step (10) holds. This completes the proof of the outer
inductive step (8) and of the theorem. ■


3.3  Multicommodity  Networks 

So  far,  this  section  has  only  studied  Braess’s  Paradox  in  single-commodity  networks.  We  next 
briefly  survey  very  recent  results  of  Lin  et  al.  [62]  on  Braess’s  Paradox  in  multicommodity 
networks.  We  define  the  Braess  ratio  for  such  networks  as  follows.  For  a  multicommodity 
instance $(G,r,c)$ and a commodity $i$, let $d_i(G,r,c)$ denote the common cost incurred by all
traffic of commodity $i$ in a Wardrop equilibrium for $(G,r,c)$. Note $d_i(G,r,c)$ is well defined
by Definition 2.1 and Proposition 2.3.


Definition 3.16 The Braess ratio $\beta(G,r,c)$ of a multicommodity instance $(G,r,c)$ is

$$\beta(G,r,c) = \max_{H \subseteq G} \min_{i=1}^{k} \frac{d_i(G,r,c)}{d_i(H,r,c)},$$

where $H$ ranges over the subnetworks of $G$ that contain an $s_i$-$t_i$ path for each $i$.




Thus  the  Braess  ratio  of  a  multicommodity  instance  is  large  only  if  removing  some  set  of 
edges  decreases  the  cost  incurred  by  the  traffic  of  every  commodity  by  a  large  amount. 
Definitions  3.1  and  3.16  coincide  in  single-commodity  networks. 

Remark  3.17  Since  removing  edges  from  a  multicommodity  network  affects  traffic  from 
different  commodities  in  different  ways,  there  are  several  possible  measures  for  the  severity 
of  Braess’s  Paradox  in  such  networks.  For  example,  one  natural  measure  is  given  by  the 
same  defining  equation  (7)  as  in  the  definition  of  the  Braess  ratio  for  single-commodity 
networks  (Definition  3.1).  Unfortunately,  while  Proposition  3.3  still  holds  for  this  measure, 
no  interesting  bounds  are  possible  for  networks  with  arbitrary  cost  functions:  even  in  two- 
commodity,  three-node  networks,  removing  a  single  edge  can  decrease  the  cost  of  a  Wardrop 
equilibrium  by  an  arbitrarily  large  factor. 

The upper bound on the Braess ratio in Theorem 3.7 does not carry over to multicommodity
networks: Lin et al. [62] showed that the Braess ratio can grow exponentially with
the  network  size,  even  in  two-commodity  networks. 

Theorem 3.18 ([62]) There is a family of two-commodity networks $\{(G^n, r^n, c^n)\}_{n=1}^{\infty}$ such
that $G^n$ has $O(n)$ vertices and edges and $\beta(G^n, r^n, c^n) = 2^{\Omega(n)}$ as $n \to \infty$.

In fact, the construction in the proof of Theorem 3.18 shows the following: adding a single
edge to a two-commodity network $(G^n, r^n, c^n)$ with $O(n)$ vertices and edges, $d_1(G^n, r^n, c^n) = 0$,
and $d_2(G^n, r^n, c^n) = 1$ can increase the common cost incurred by traffic of the two
commodities to roughly the $(n-1)$th and $n$th Fibonacci numbers, respectively.

On  the  other  hand,  the  Braess  ratio  is  always  at  most  exponential  in  the  network  size. 

Theorem 3.19 ([62]) There is a constant $c > 0$ such that for every $k, n \ge 1$ and every
instance $(G,r,c)$ with $k$ commodities and $n$ vertices, $\beta(G,r,c) \le 2^{ckn}$.

The proof of Theorem 3.19 actually shows the stronger statement that if $f$ is a Wardrop
equilibrium for the $k$-commodity, $n$-vertex instance $(G,r,c)$ and $\tilde f$ is feasible for $(G,r,c)$,
then the maximum cost $\max_i d_i(G,r,c)$ incurred by traffic in $f$ is at most $2^{O(kn)}$ times the
maximum cost incurred by traffic in $\tilde f$. The question of whether or not the largest-possible Braess ratio
of multicommodity networks depends on the number of commodities is open.

3.4  Detecting  Braess’s  Paradox  Is  Hard 

Previous  results  of  this  section  were  devoted  to  the  analysis  of  the  worst-case  severity  of 
Braess’s  Paradox.  Braess’s  Paradox  also  suggests  a  natural  algorithmic  question:  given  a 
network,  is  it  suffering  from  the  paradox?  If  so,  which  edges  should  be  removed  to  recover 
the  best-possible  Wardrop  equilibrium? 

This  innocuous  question  turns  out  to  be  extremely  difficult  to  answer,  in  a  sense  we 
make  precise  below.  To  keep  things  simple,  we  will  initially  consider  only  single-commodity 
networks with linear cost functions. Detecting Braess’s Paradox can be phrased as an
optimization problem as follows: given a single-commodity instance $(G,r,c)$ with linear cost
functions, find a subnetwork $H$ that minimizes the cost of a Wardrop equilibrium for $(H,r,c)$
over all subnetworks $H \subseteq G$. We call this optimization problem Linear Network Design.

Linear Network Design can be solved by enumerating all subgraphs $H$ of $G$, computing
a Wardrop equilibrium in each, and picking the best solution. (Since Wardrop equilibria
are the minima of the convex function in (1), one can be computed using convex programming.)
On the other hand, there may be an exponential number of candidate subnetworks
$H$. How well can we solve this optimization problem if we use only a reasonable amount of
computational resources?

We will use basic concepts of computational complexity theory as described in, for
example, Garey and Johnson [44]. Recall that a $\gamma$-approximation algorithm for a minimization
problem runs in polynomial time and returns a solution no more than $\gamma$ times as costly as
an optimal solution. The value $\gamma$ is the approximation ratio or performance guarantee of the
algorithm.

While we would obviously like to solve Linear Network Design optimally in polynomial
time, a natural weaker goal is to design a $\gamma$-approximation algorithm with $\gamma$ as close to 1
as possible. Of course, even the trivial algorithm, which always returns the entire network
$G$, can be viewed as an approximation algorithm for Linear Network Design. Because
the Braess ratio of every network with linear cost functions is at most 4/3 (Theorem 2.17 and
Proposition 3.3), we have the following guarantee on the trivial algorithm.

Proposition 3.20 The trivial algorithm is a $\frac{4}{3}$-approximation algorithm for Linear
Network Design.

Needless  to  say,  we  should  aspire  to  design  better,  more  clever  approximation  algorithms. 
Alas, none exist, assuming $P \ne NP$.

Theorem 3.21 ([78]) For every $\epsilon > 0$, there is no $(\frac{4}{3} - \epsilon)$-approximation algorithm for
Linear Network Design (unless $P = NP$).

Proof: We give a polynomial-time “gap reduction” from the NP-complete problem 2 Directed
Disjoint Paths (2DDP) [40]: given a directed graph $G = (V,E)$ and distinct
vertices $s_1, s_2, t_1, t_2 \in V$, are there $s_i$-$t_i$ paths $P_i$ for $i = 1, 2$, such that $P_1$ and $P_2$ are
vertex-disjoint? We can prove the theorem by showing how a $(\frac{4}{3} - \epsilon)$-approximation algorithm for
Linear Network Design can be used to differentiate between “yes” and “no” instances
of 2DDP in polynomial time.

Consider an instance $\mathcal{I}$ of 2DDP, as above. Augment the vertex set $V$ by an additional source $s$ and sink $t$, and include the directed edges $(s, s_1)$, $(s, s_2)$, $(t_1, t)$, and $(t_2, t)$ (see Figure 6). Denote the new network by $G' = (V', E')$ and endow the edges of $E'$ with the following linear cost functions $c$: edges of $E$ are given the cost function $c(x) = 0$, edges $(s, s_2)$ and $(t_1, t)$ are given the cost function $c(x) = x$, and edges $(s, s_1)$ and $(t_2, t)$ are given the cost function $c(x) = 1$. The instance $(G', 1, c)$ can be constructed from $\mathcal{I}$ in polynomial time.

We can complete the proof by establishing two statements: if $\mathcal{I}$ is a “yes” instance of 2DDP, then $G'$ admits a subnetwork $H$ such that a Wardrop equilibrium for $(H, 1, c)$ has cost 3/2; and if $\mathcal{I}$ is a “no” instance, then for every subnetwork $H$ of $G'$, a Wardrop equilibrium for $(H, 1, c)$ has cost at least 2.
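
The arithmetic behind the gap can be spelled out as a short sketch. On a “yes” instance the best subnetwork has equilibrium cost at most 3/2, so a $(\frac{4}{3} - \epsilon)$-approximation algorithm must return a subnetwork whose equilibrium cost is at most

```latex
\left(\tfrac{4}{3} - \epsilon\right) \cdot \tfrac{3}{2} \;=\; 2 - \tfrac{3}{2}\,\epsilon \;<\; 2 ,
```

whereas on a “no” instance every subnetwork has equilibrium cost at least 2. Comparing the equilibrium cost of the returned subnetwork with 2 would therefore separate the two kinds of instances.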
Figure 6: Proof of Theorem 3.21. In a “no” instance of 2DDP, the existence of $s_1$-$t_1$ and $s_2$-$t_2$ paths implies the existence of an $s_2$-$t_1$ path.


First suppose there are vertex-disjoint $s_1$-$t_1$ and $s_2$-$t_2$ paths $P_1$ and $P_2$ in $G$, respectively. Obtain $H$ by deleting all edges of $G$ not contained in some $P_i$. Then, $H$ is a subgraph of $G'$ with exactly two $s$-$t$ paths, and routing half a unit of flow along each yields a Wardrop equilibrium with cost 3/2 (cf., Figure 2(a)).

Now suppose $\mathcal{I}$ is a “no” instance and consider a subgraph $H$ of $G'$. We can assume that $H$ contains an $s$-$t$ path. If $H$ has an $s$-$t$ path $P$ containing an $s_2$-$t_1$ path, then routing all of the flow on $P$ yields a Wardrop equilibrium with cost 2 (cf., Figure 2(b)). Otherwise, since $\mathcal{I}$ is a “no” instance of 2DDP, only two possibilities remain (see Figure 6): either for precisely one $i \in \{1, 2\}$, $H$ has an $s$-$t$ path $P$ containing an $s_i$-$t_i$ path, or all $s$-$t$ paths $P$ in $H$ contain an $s_1$-$t_2$ path of $G$. In either case, routing one unit of flow along such a path $P$ provides a Wardrop equilibrium with cost 2. ■
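
The two costs used in the proof can be checked on toy instances. The following is a minimal sketch, not part of the original argument: the function names, the vertex labels "s" and "t", and the two tiny graphs are illustrative assumptions. It only evaluates the cost of a given flow and does not itself verify the equilibrium conditions.

```python
def build_reduction(edges, s1, s2, t1, t2):
    """Return the edge costs of G' as a dict edge -> (a, b), meaning c(x) = a*x + b.

    The labels "s" and "t" for the added source and sink are assumed not to
    clash with existing vertex names.
    """
    cost = {e: (0.0, 0.0) for e in edges}   # edges of E: c(x) = 0
    cost[("s", s2)] = (1.0, 0.0)            # c(x) = x
    cost[(t1, "t")] = (1.0, 0.0)            # c(x) = x
    cost[("s", s1)] = (0.0, 1.0)            # c(x) = 1
    cost[(t2, "t")] = (0.0, 1.0)            # c(x) = 1
    return cost


def flow_cost(path_flows, cost):
    """Total cost of a flow given as {path (tuple of vertices): amount}."""
    edge_flow = {}
    for path, amount in path_flows.items():
        for e in zip(path, path[1:]):
            edge_flow[e] = edge_flow.get(e, 0.0) + amount
    return sum(x * (cost[e][0] * x + cost[e][1]) for e, x in edge_flow.items())


# Toy "yes" instance: vertex-disjoint paths s1 -> t1 and s2 -> t2.
yes_cost = build_reduction([("s1", "t1"), ("s2", "t2")], "s1", "s2", "t1", "t2")
yes_flow = {("s", "s1", "t1", "t"): 0.5, ("s", "s2", "t2", "t"): 0.5}
print(flow_cost(yes_flow, yes_cost))

# Toy subnetwork whose only s-t path contains an s2-t1 path.
no_cost = build_reduction([("s2", "t1")], "s1", "s2", "t1", "t2")
no_flow = {("s", "s2", "t1", "t"): 1.0}
print(flow_cost(no_flow, no_cost))
```

In the first case both paths cost 3/2 per unit of flow, so splitting the traffic evenly is an equilibrium; in the second the single path is the only option.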

Thus, no polynomial-time algorithm for Linear Network Design has an approximation ratio superior to that of the trivial algorithm. Equivalently, it is NP-hard to distinguish between “paradox-free” instances (with Braess ratio 1) and instances suffering from the most severe manifestations of the paradox (with Braess ratio 4/3).

While  we  have  only  established  the  optimality  of  the  trivial  algorithm  for  networks  with 
linear  cost  functions,  similar  results  hold  with  other  sets  of  allowable  edge  cost  functions.  For 
example, let General Network Design be the analogous optimization problem for single-commodity networks with arbitrary cost functions. Theorem 3.7 implies that the trivial algorithm is a $\lfloor n/2 \rfloor$-approximation algorithm for General Network Design, where $n$ is
the  number  of  network  vertices.  On  the  other  hand,  the  following  inapproximability  result 
holds. 

Theorem 3.22 ([78]) Assuming P $\neq$ NP, for every $\epsilon > 0$ there is no $(\lfloor n/2 \rfloor - \epsilon)$-approximation algorithm for General Network Design.

The  proof  of  Theorem  3.22  is  somewhat  involved  and  makes  use  of  the  Braess  graphs  that 
were  introduced  in  the  proof  of  Theorem  3.4.  For  the  proof,  and  similar  results  for  other  sets 
of  allowable  cost  functions,  see  [78].  For  analogous  intractability  results  for  multicommodity 
networks,  which  build  on  the  two-commodity  networks  alluded  to  in  Theorem  3.19,  see  Lin 
et al. [62].
4 Coping with Selfishness: How To Reduce the Price of Anarchy

We  have  seen  that  the  price  of  anarchy  of  selfish  routing  can  be  large  in  networks  with 
highly  nonlinear  cost  functions,  including  with  functions  that  are  common  in  applications, 
such  as  M/M/1  delay  functions.  This  final  technical  section  asks:  other  than  somehow 
enforcing  optimal  routing,  what  can  we  do  about  it?  Can  modest  intervention,  when  feasible, 
significantly  reduce  the  price  of  anarchy?  We  briefly  discuss  three  techniques  for  mitigating 
the  inefficiency  of  selfish  routing:  increasing  the  capacity  of  the  network  (Subsection  4.1), 
routing  a  small  amount  of  traffic  centrally  (Subsection  4.2),  and  influencing  traffic  with  edge 
taxes  (Subsection  4.3). 

4.1  Capacity  Augmentation 

For  the  rest  of  this  survey  we  study  networks  with  arbitrary  cost  functions,  where  the  price 
of  anarchy  is  unbounded.  We  next  show  that  a  bound  on  the  inefficiency  of  selfish  routing 
in  such  networks  is  nonetheless  possible,  via  a  so-called  bicriteria  approach.  Specifically,  our 
next  result  is  that  the  cost  of  a  Wardrop  equilibrium  is  at  most  that  of  an  optimal  flow 
that  is  forced  to  route  twice  as  much  traffic  between  each  source-sink  pair.  We  will  see  that 
this  result  has  the  following  alternative  interpretation:  in  lieu  of  centralized  control,  the 
inefficiency  of  selfish  routing  can  be  offset  by  a  moderate  increase  in  link  speed. 

Example 4.1 Consider the nonlinear variant of Pigou’s example (Figure 1(b)): a two-node, two-link network with cost functions $c(x) = 1$ and $c(x) = x^p$ for $p$ large. Recall that with one unit of traffic, the Wardrop equilibrium routes all flow on the lower edge, while the optimal flow routes $\epsilon$ units of flow on the upper edge and the rest on the lower edge (where $\epsilon \to 0$ as $p \to \infty$). When the traffic rate $r$ exceeds one, an optimal flow assigns the additional $r - 1$ units of traffic to the upper link, incurring a cost that tends to $r - 1$ as $p \to \infty$. In particular, for every $p$ an optimal flow feasible for twice the original traffic rate ($r = 2$) has cost at least 1, which equals the cost of the Wardrop equilibrium in the original instance.
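
The claim of Example 4.1 can be checked numerically. The sketch below is illustrative only: the function name and the closed form for the optimal split (obtained by setting the derivative of the total cost to zero) are our own working, not taken from the survey.

```python
def pigou_optimal_cost(r, p):
    """Minimum cost of routing r >= 1 units over links with c(x) = 1 and c(x) = x**p.

    With x units on the x**p link the total cost is (r - x) + x**(p + 1), which is
    convex in x; its derivative vanishes at x = (p + 1) ** (-1 / p) < 1 <= r.
    """
    x = (p + 1) ** (-1.0 / p)
    return (r - x) + x ** (p + 1)


# At rate 1 the Wardrop equilibrium puts all traffic on the x**p link, at cost 1.
EQUILIBRIUM_COST = 1.0

for p in (1, 2, 10, 100, 1000):
    # Optimal cost at the doubled rate r = 2, to compare with EQUILIBRIUM_COST.
    print(p, pigou_optimal_cost(2.0, p))
```

For every tested $p$ the optimal cost at rate 2 stays at or above the equilibrium cost at rate 1, and it approaches 1 as $p$ grows.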

We  now  show  that  the  bound  stated  in  Example  4.1  holds  for  all  instances. 

Theorem 4.2 ([89]) If $f$ is a Wardrop equilibrium for $(G, r, c)$ and $f^*$ is feasible for $(G, 2r, c)$, then

$$C(f) \le C(f^*).$$

Proof: Let $f$ and $f^*$ denote a Wardrop equilibrium for $(G, r, c)$ and a feasible flow for $(G, 2r, c)$, respectively. For each commodity $i$, let $d_i(G, r, c)$ denote the common cost incurred by the traffic of commodity $i$ in the flow $f$ (see Definition 3.16). Definition 2.1 and the definition of cost (3) imply that $C(f) = \sum_i r_i\, d_i(G, r, c)$.

The key idea is to define a set of cost functions $\bar{c}$ that satisfies two properties: lower bounding the cost of $f^*$ relative to that of $f$ is easy with respect to $\bar{c}$; and the new cost functions $\bar{c}$ approximate the original ones $c$, in the sense that the cost of $f^*$ with respect to $\bar{c}$ is close to its original cost. Specifically, we define the cost function $\bar{c}_e$ for each edge $e$ as follows:

$$\bar{c}_e(x) = \begin{cases} c_e(f_e) & \text{if } x \le f_e \\ c_e(x) & \text{if } x \ge f_e. \end{cases}$$

[Figure 7 panels: (a) Graph of the cost function $c_e$ and its value at the flow value $f_e$; (b) Graph of the cost function $\bar{c}_e$.]

Figure 7: Construction in the proof of Theorem 4.2 of the modified cost function $\bar{c}_e$ given the original cost function $c_e$ and the Wardrop equilibrium value $f_e$. Solid lines denote graphs of functions.



Figure 7 illustrates this construction. Let $\bar{C}(\cdot)$ denote the cost of a flow in the instance $(G, r, \bar{c})$. Note that $\bar{C}(f^*) \ge C(f^*)$ while $\bar{C}(f) = C(f)$.

We first upper bound the amount by which the new cost $\bar{C}(f^*)$ of $f^*$ can exceed its original cost $C(f^*)$. For every edge $e$, $\bar{c}_e(x) - c_e(x)$ is zero for $x \ge f_e$ and bounded above by $c_e(f_e)$ for $x < f_e$, so

$$x\,(\bar{c}_e(x) - c_e(x)) \le c_e(f_e)\, f_e \tag{14}$$

for all $x \ge 0$. The left-hand side of (14), the discrepancy between $x\bar{c}_e(x)$ and $x c_e(x)$, is maximized when $x$ is slightly smaller than $f_e$ and when $c_e(x) = 0$. In this case, the value of the left-hand side of (14) is essentially the area of the rectangle enclosed by dashed lines in Figure 7(a), which in turn is the cost incurred by the Wardrop equilibrium $f$ on the edge $e$. Thus

$$\bar{C}(f^*) - C(f^*) = \sum_{e \in E} f^*_e\,(\bar{c}_e(f^*_e) - c_e(f^*_e)) \le \sum_{e \in E} c_e(f_e)\, f_e = C(f). \tag{15}$$



In other words, evaluating $f^*$ with cost functions $\bar{c}$, rather than $c$, increases its cost by at most an additive term of $C(f)$.
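
Inequality (14) is easy to test numerically for a single edge. The sketch below is only a sanity check under assumed data: the cubic cost function and the value 0.8 for the equilibrium flow are hypothetical choices, and the helper name is ours.

```python
def modified_cost(c, f_e):
    """Modified cost from the proof: constant c(f_e) up to f_e, equal to c beyond it."""
    return lambda x: c(f_e) if x <= f_e else c(x)


def c(x):
    """Hypothetical nondecreasing edge cost function."""
    return x ** 3


f_e = 0.8  # hypothetical equilibrium flow on the edge
c_bar = modified_cost(c, f_e)

# Largest value of the left-hand side of (14) over a grid of flow values in [0, 2].
grid = [i / 1000 for i in range(2001)]
worst = max(x * (c_bar(x) - c(x)) for x in grid)
bound = c(f_e) * f_e
print(worst, bound)
```

On this example the largest discrepancy stays below the bound $c_e(f_e) f_e$, as (14) asserts.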

Now we lower bound $\bar{C}(f^*)$. By construction, the modified cost $\bar{c}_e(\cdot)$ of an edge $e$ is always at least $c_e(f_e)$, so the modified cost $\bar{c}_P(\cdot)$ of a path $P \in \mathcal{P}_i$ is always at least $c_P(f)$, which in turn is at least $d_i(G, r, c)$. Therefore,

$$\bar{C}(f^*) = \sum_{P \in \mathcal{P}} \bar{c}_P(f^*)\, f^*_P \ge \sum_{i} \sum_{P \in \mathcal{P}_i} d_i(G, r, c)\, f^*_P = \sum_{i} 2 r_i\, d_i(G, r, c) = 2\,C(f). \tag{16}$$

The theorem now follows immediately from inequalities (15) and (16). ■
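
Spelled out, the final step rearranges (15) and then applies (16), writing $\bar{C}$ for cost with respect to the modified cost functions:

```latex
C(f^*) \;\ge\; \bar{C}(f^*) - C(f) \;\ge\; 2\,C(f) - C(f) \;=\; C(f).
```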

Remark  4.3  Example  4.1  shows  that  the  bound  in  Theorem  4.2  is  the  best  possible. 

Another  interpretation  of  Theorem  4.2  is  that  the  benefit  of  centralized  control  is  equaled 
or  exceeded  by  the  benefit  of  a  sufficient  improvement  in  link  technology. 

Corollary 4.4 ([82]) Let $(G, r, c)$ be an instance and define the modified cost function $\tilde{c}_e$ by $\tilde{c}_e(x) = c_e(x/2)/2$ for each edge $e$. Let $\tilde{f}$ be a Wardrop equilibrium for $(G, r, \tilde{c})$ with cost $\tilde{C}(\tilde{f})$, and $f^*$ a feasible flow for $(G, r, c)$ with cost $C(f^*)$. Then $\tilde{C}(\tilde{f}) \le C(f^*)$.

Simple  calculations  show  that  Theorem  4.2  and  Corollary  4.4  are  equivalent;  for  details, 
see  [82]  or  [86]. 
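
One direction of this equivalence can be sketched as follows; this is our own outline, not the calculation in [82] or [86]. Write $\tilde{c}_e(x) = c_e(x/2)/2$ for the modified cost functions, $\tilde{f}$ for a Wardrop equilibrium of $(G, r, \tilde{c})$, and $\tilde{C}$ for cost with respect to $\tilde{c}$. Since $c_e(\tilde{f}_e/2) = 2\,\tilde{c}_e(\tilde{f}_e)$ on every edge, each path costs exactly twice as much under $c$ at the halved flow $\tilde{f}/2$ as under $\tilde{c}$ at $\tilde{f}$, so $\tilde{f}/2$ is a Wardrop equilibrium for $(G, r/2, c)$. Its cost is

```latex
\begin{aligned}
C(\tilde{f}/2)
  &= \sum_{e \in E} \frac{\tilde{f}_e}{2}\, c_e\!\left(\frac{\tilde{f}_e}{2}\right)
   = \sum_{e \in E} \frac{\tilde{f}_e}{2} \cdot 2\,\tilde{c}_e(\tilde{f}_e)
   = \tilde{C}(\tilde{f}).
\end{aligned}
```

Applying Theorem 4.2 to the instance $(G, r/2, c)$, with $f^*$ feasible for $(G, r, c)$, then gives $\tilde{C}(\tilde{f}) = C(\tilde{f}/2) \le C(f^*)$.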

Corollary 4.4 takes on a particularly nice form in instances in which all cost functions are M/M/1 delay functions (Example 2.11). In this case, if the cost function $c_e$ of edge $e$ is $c_e(x) = (u_e - x)^{-1}$, then the modified function $\tilde{c}_e$ is $\tilde{c}_e(x) = 1/(2(u_e - x/2)) = 1/(2u_e - x)$.
Corollary  4.4  thus  offers  the  following  advice  for  networks  where  cost  functions  are  M/M/1 
delay  functions  and  capacity  is  cheap:  to  outperform  optimal  routing,  just  double  the  capacity 
of  every  edge. 
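
The identity behind this advice can be confirmed with a few lines of code. This is a minimal sketch: the helper names and the capacity value are illustrative assumptions.

```python
def mm1(u):
    """M/M/1 delay function c(x) = 1 / (u - x) for capacity u (valid for x < u)."""
    return lambda x: 1.0 / (u - x)


def modified(c):
    """Modified cost function of Corollary 4.4: x -> c(x / 2) / 2."""
    return lambda x: c(x / 2.0) / 2.0


u = 3.0  # hypothetical edge capacity
sample_flows = (0.0, 1.0, 2.5, 5.0)
for x in sample_flows:
    # The modified delay function and the delay function with doubled capacity agree.
    print(x, modified(mm1(u))(x), mm1(2 * u)(x))
```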

4.2  Stackelberg  Routing 

A  second  approach  to  reducing  the  price  of  anarchy,  also  explored  in  [82],  is  to  allow  a 
small  portion  of  the  network  traffic  to  be  routed  centrally.  We  will  call  this  Stackelberg 
routing, after a concept from noncooperative game theory called Stackelberg games [98]. In
the  interest  of  space,  we  only  describe  our  model  of  Stackelberg  routing  informally,  via  two 
examples. 

Example 4.5 To understand the potential power of Stackelberg routing, consider the nonlinear variant of Pigou’s example (Figure 1(b)) with $p$ large. Suppose we are granted the ability to route a $\gamma \in [0, 1]$ fraction of the traffic as we wish, knowing that the other $(1 - \gamma)$
fraction  of  the  traffic  will  then  choose  routes  selfishly,  as  usual.  (Definition  2.1  thus  governs 
the  routes  chosen  by  selfish  traffic,  but  not  by  the  centrally  routed  traffic.)  We  will  call 
a  routing  of  the  centrally  controlled  traffic  a  Stackelberg  strategy.  Observe  that  for  every 
Stackelberg  strategy,  the  selfish  traffic  will  use  the  lower  edge — the  upper  route  is  never 
attractive  to  selfish  users,  even  if  the  lower  one  is  fully  congested.  On  the  other  hand,  if  we 
route  some  traffic  on  the  upper  edge  ourselves,  the  cost  of  the  overall  solution  decreases.  In 
particular, if $\gamma$ is sufficiently large, we can mimic the optimal flow on the upper edge (routing
excess  traffic  on  the  lower  edge)  and  induce  the  optimal  flow.  Thus  Stackelberg  routing  can 
decrease,  or  even  eradicate,  the  inefficiency  of  selfish  routing  in  this  example. 
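
The effect of a Stackelberg strategy in this example can be computed directly. The sketch below assumes one unit of total traffic; the function names and the closed form for the optimal upper-edge flow (from setting the derivative of the cost to zero) are our own working.

```python
def stackelberg_cost(upper, p):
    """Total cost when `upper` units are routed centrally on the c(x) = 1 edge.

    All remaining traffic, selfish or centrally routed, uses the c(x) = x**p edge.
    """
    lower = 1.0 - upper
    return upper * 1.0 + lower * lower ** p


def best_upper_flow(gamma, p):
    """Upper-edge flow of the best strategy when a gamma fraction is controlled.

    The cost is convex in the upper-edge flow, so the best choice is the optimal
    upper-edge flow, capped at gamma.
    """
    optimal_upper = 1.0 - (p + 1) ** (-1.0 / p)
    return min(gamma, optimal_upper)


p = 1
for gamma in (0.0, 0.25, 0.5, 1.0):
    print(gamma, stackelberg_cost(best_upper_flow(gamma, p), p))
```

With $p = 1$ the optimal flow puts half of the traffic on the upper edge, so controlling any fraction $\gamma \ge 1/2$ already induces the optimal flow in this sketch.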

Example 4.6 Stackelberg routing also has its limitations. Suppose we modify Example 4.5 by replacing the cost function $c(x) = x^p$ of the lower edge in Figure 1(b) by the cost function $c(x) = x^p/(1 - \gamma)^p$, where $\gamma$ is the fraction of traffic that we are permitted to route centrally. The key observation is that no matter how the centrally controlled traffic is routed, there is enough selfish traffic to fully congest the lower edge. Therefore, Stackelberg strategies that route at most $\gamma$ units of traffic cannot produce a flow with cost less than 1. On the other hand, the optimal flow, which routes $\gamma + \epsilon$ units of flow on the upper edge and the rest on the lower edge, has cost approaching $\gamma$ as $p \to \infty$ and $\epsilon \to 0$.
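
The ratio in Example 4.6 can be tabulated for growing $p$. This is a sketch under our own working: the function name and the closed form for the optimal lower-edge flow are assumptions derived by setting the derivative of the cost to zero, and $\gamma < 1$ is required.

```python
def example_4_6_optimal_cost(gamma, p):
    """Optimal cost with upper edge c(x) = 1 and lower edge c(x) = (x / (1 - gamma)) ** p.

    With y units on the lower edge the cost is (1 - y) + y * (y / (1 - gamma)) ** p;
    the derivative vanishes at y = (1 - gamma) * (p + 1) ** (-1 / p).
    """
    y = (1.0 - gamma) * (p + 1) ** (-1.0 / p)
    return (1.0 - y) + y * (y / (1.0 - gamma)) ** p


gamma = 0.5
for p in (1, 10, 100, 1000):
    # Every Stackelberg strategy costs at least 1, so this ratio is a lower bound.
    print(p, 1.0 / example_4_6_optimal_cost(gamma, p))
```

As $p$ grows the printed ratio climbs toward $1/\gamma$ without reaching it.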

Stackelberg  routing  was  first  proposed  by  Korilis,  Lazar,  and  Orda  [58],  who  were  motivated 
by  so-called  virtual  private  networks  (see  e.g.  Birman  [12]  for  a  discussion  of  VPNs).  The 
main  goal  in  [58]  was  to  characterize  the  instances  in  which  some  Stackelberg  strategy  induces 
an  optimal  flow  (as  in  Example  4.5).  This  problem  has  also  been  studied  more  recently  by 
Kaporis, Politopoulou, and Spirakis [54]. Here we follow Roughgarden [85] and seek worst-case bounds on the ratio between the cost of the best flow possible with Stackelberg routing and that of an optimal flow. Example 4.6 shows that, for each $\gamma \in (0, 1]$, this ratio can be arbitrarily close to $1/\gamma$, even in two-node, two-link networks. One of the main results of [85] is a matching upper bound for networks of parallel links.

Theorem 4.7 ([85]) For every instance $(G, r, c)$ with a network of parallel links and every $\gamma \in (0, 1]$, there is a Stackelberg strategy that routes $\gamma r$ units of traffic and yields a flow with cost at most $1/\gamma$ times the cost of an optimal flow for $(G, r, c)$.

Theorem  4.7  provides  a  smooth  trade-off  between  optimal  flows  and  Wardrop  equilibria,  as 
a function of the fraction of centrally controlled traffic. When $\gamma = 0$, we are stuck with a Wardrop equilibrium, which can cost arbitrarily more than an optimal flow in a network with arbitrary cost functions. When $\gamma = 1$ and we control all of the traffic, we can of course route the traffic optimally. Example 4.6 and Theorem 4.7 precisely quantify the inefficiency of selfish routing for all intermediate values of $\gamma$ (in networks of parallel links).

Remark  4.8  The  proof  of  Theorem  4.7  is  constructive,  and  uses  a  simple  iterative  algorithm 
to  compute  a  good  Stackelberg  strategy.  This  algorithm  runs  in  polynomial  time  as  long  as 
the  network  cost  functions  satisfy  a  mild  convexity  condition  (see  [85]  for  details).  While  this 
algorithm  is  sufficient  to  obtain  the  best-possible  worst-case  guarantee  in  Theorem  4.7,  it  does 
not  compute  the  optimal  Stackelberg  strategy  in  every  instance.  Indeed,  the  optimization 
problem  of  computing  an  optimal  Stackelberg  strategy  is  NP-hard  [85],  though  it  can  be 
closely  approximated  in  polynomial  time  [60]. 

Theorem 4.7 applies only to networks of parallel links, and the power of Stackelberg routing in more general networks is not fully understood. The $1/\gamma$ upper bound of Theorem 4.7 does not hold in general single-commodity networks, and no interesting bounds are possible in multicommodity networks [86]. Very recently, Fleischer and Swamy [39] proved an analogue of Theorem 4.7, with $1/\gamma$ replaced by a somewhat larger function of $\gamma$, for a wide class of networks, including series-parallel networks and the Braess graphs of Subsection 3.1. The question of whether or not such a result holds for general single-commodity networks is open.

4.3  Pricing  Network  Edges 

We  conclude  with  a  third,  very  natural  approach  to  reducing  the  price  of  anarchy  of  selfish 
routing:  influencing  selfish  behavior  with  edge  taxes.  While  not  discussed  in  [82],  this  idea 
has been extensively studied since the earliest papers on selfish routing. The literature on
pricing  selfish  routing  networks  is  vast,  and  we  will  confine  our  attention  to  only  one  classical 
result  and  two  currently  active  research  directions.  See  Yang  and  Huang  [103],  for  example, 
for  an  introduction  to  this  research  area. 

Pigou  [72]  suggested  what  are  often  called  marginal  cost  taxes  or  Pigouvian  taxes.  The 
idea  of  marginal  cost  pricing  is  to  charge  each  network  user  on  each  edge  for  the  additional 
cost  its  presence  causes  for  the  other  users  of  the  edge.  To  discuss  this  idea  formally,  we  now 
allow each edge $e$ of a selfish routing network to possess a nonnegative tax $\tau_e$. We denote a selfish routing instance $(G, r, c)$ with edge taxes $\tau$ by $(G, r, c + \tau)$. A Wardrop equilibrium for such an instance $(G, r, c + \tau)$ is defined as in Definition 2.1, with all traffic traveling on routes that minimize the sum of the edge costs and edge taxes. Equivalently, it is a Wardrop equilibrium for the instance $(G, r, c^{\tau})$, where the cost function $c^{\tau}_e$ is a shifted version of the original cost function $c_e$: $c^{\tau}_e(x) = c_e(x) + \tau_e$ for all $x \ge 0$.

Mathematically, the principle of marginal cost pricing asserts that for a flow $f$ feasible for an instance $(G, r, c)$, the tax $\tau_e$ assigned to the edge $e$ should be $\tau_e = f_e \cdot c'_e(f_e)$, where $c'_e$ denotes the derivative of $c_e$. (Assume for simplicity that the cost functions are differentiable.) The term $c'_e(f_e)$ corresponds to the marginal increase in cost caused by one user of the edge, and the term $f_e$ is the amount of traffic that suffers from this increase. Pigou [72] suggested
that  these  taxes  should  eliminate  all  of  the  inefficiency  of  selfish  routing,  and  Beckmann, 
McGuire,  and  Winsten  [8]  made  this  idea  rigorous. 

Proposition 4.9 ([8, 72]) Let $(G, r, c)$ be an instance with differentiable cost functions, admitting an optimal flow $f^*$. Let $\tau_e = f^*_e \cdot c'_e(f^*_e)$ denote the marginal cost tax for edge $e$ with respect to $f^*$. Then $f^*$ is a Wardrop equilibrium for $(G, r, c + \tau)$.

In  words,  marginal  cost  taxes  induce  an  optimal  flow  as  a  Wardrop  equilibrium. 
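
As a small worked illustration, consider the linear version of Pigou’s example with one unit of traffic, an upper link of cost $c(x) = 1$ and a lower link of cost $c(x) = x$. The sketch below is only an assumed two-link setup; the bisection helper and all names are ours.

```python
def equilibrium_lower_flow(cost_upper, cost_lower, tol=1e-12):
    """Wardrop equilibrium flow on the lower of two parallel links (traffic rate 1).

    Assumes both cost functions are nondecreasing in their own flow and bisects
    on the lower-link flow until the two link costs balance.
    """
    if cost_lower(1.0) <= cost_upper(0.0):
        return 1.0  # lower link is no worse even when fully loaded
    if cost_lower(0.0) >= cost_upper(1.0):
        return 0.0  # upper link is no worse even when fully loaded
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cost_lower(mid) < cost_upper(1.0 - mid):
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# The optimal flow minimizes (1 - x) + x * x, giving x = 1/2 on the lower link.
optimal_lower = 0.5
tax_lower = optimal_lower * 1.0  # f*_e times c'_e(f*_e), where c'(x) = 1
tax_upper = 0.5 * 0.0            # the constant cost function has derivative 0

untaxed = equilibrium_lower_flow(lambda x: 1.0, lambda x: x)
taxed = equilibrium_lower_flow(lambda x: 1.0 + tax_upper, lambda x: x + tax_lower)
print(untaxed, taxed)
```

Without taxes all traffic uses the lower link; with the marginal cost tax of 1/2 on the lower link, the equilibrium coincides with the optimal flow.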

While  Proposition  4.9  may  appear  to  be  a  complete  solution  to  the  problem  of  reducing 
the  price  of  anarchy  of  selfish  routing,  it  possesses  several  drawbacks.  Two  of  these  have 
recently  motivated  much  research  in  the  theoretical  computer  science  and  mathematical 
programming  communities. 

First,  the  definition  of  a  Wardrop  equilibrium  in  Proposition  4.9  implicitly  assumes  that 
all  network  users  trade  off  cost  and  taxes  in  an  identical  way.  For  example,  if  edge  costs 
represent  travel  time,  some  users  might  be  more  sensitive  to  time  delays,  while  others  are 
concerned  primarily  with  monetary  expenses.  Several  papers  have  considered  the  following 
model  of  heterogeneous  traffic:  each  network  user  chooses  a  path  that  minimizes  a  weighted 
sum  of  the  edge  costs  and  the  edge  taxes.  In  other  words,  the  preferences  of  a  network  user 
are  summarized  by  a  single  scalar — the  monetary  value  to  the  user  of  one  unit  of  cost.  This 
model  was  first  proposed  and  studied  in  the  transportation  science  literature  [30,  32,  74], 
but  analogues  of  Proposition  4.9  were  only  recently  given  for  heterogeneous  traffic.  Cole, 
Dodis,  and  Roughgarden  [22]  proved  that,  in  single-commodity  networks  with  heterogeneous 
traffic,  there  is  always  a  set  of  taxes  that  induces  the  optimal  flow  as  a  Wardrop  equilibrium. 
This  result  was  extended  to  multicommodity  networks  independently  by  Fleischer,  Jain,  and 
Mahdian [38], Karakostas and Kolliopoulos [56], and Yang and Huang [102].

Second,  in  networks  where  cost  functions  can  have  large  derivatives,  the  marginal  cost 
taxes  of  Proposition  4.9  can  be  extremely  large.  Several  solutions  have  recently  been  proposed 
for this problem, including computing a tax that induces an optimal flow but also minimizes
the  taxes  paid  [9,  48];  incorporating  the  taxes  paid  into  the  objective  function  [22,  24];  and 
proving  worst-case  bounds  on  the  largest  tax  needed  to  induce  an  optimal  flow  [22,  37,  38]. 


5  Recent  Related  Work 

This  survey  describes  the  basic  results  on  the  price  of  anarchy  of  selfish  routing.  However, 
we have only scratched the surface of a broader issue: quantifying the inefficiency of noncooperative equilibria in applications with selfish users. This fundamental problem has only
recently  been  systematically  studied,  but  there  is  already  a  large  literature  addressing  many 
aspects  of  it.  We  conclude  this  survey  by  briefly  discussing  some  of  the  recent  work  in  this 
lively  research  area. 

First,  the  price  of  anarchy  has  been  analyzed  in  numerous  variants  and  generalizations  of 
the  basic  selfish  routing  model  studied  in  this  survey.  Several  recent  papers  have  extended 
Theorem  2.17  to  more  general  classes  of  games  [17,  70,  90].  The  price  of  anarchy  of  selfish 
routing  has  also  been  studied  with  objective  functions  other  than  (3)  [26,  62,  80,  84,  101]; 
with  edge  capacities  and  other  types  of  “side  constraints”  [27,  49,  55];  when  the  traffic  rates 
can  vary  with  the  network  congestion  [17,  23];  when  network  users  can  have  non-negligible 
size [3, 6, 7, 21, 25, 28, 41, 57, 87, 89, 95]; and with definitions of path cost $c_P(f)$ other than
the  sum  of  all  edge  costs  [7,  23]. 

Second,  the  price  of  anarchy  is  a  very  general  concept — applicable  to  every  noncooperative 
game  with  a  notion  of  equilibrium  and  a  nonnegative  objective  function.  In  games  where 
different  equilibria  can  have  different  objective  function  values,  the  price  of  anarchy  is  usually 
defined  as  the  ratio  between  the  objective  function  value  of  the  worst  equilibrium  and  that  of 
an  optimal  solution  [59].  The  related  concept  of  the  price  of  stability  [3]  instead  considers  the 
objective  function  value  of  the  best  equilibrium.  The  price  of  anarchy  and  price  of  stability 
have  been  successfully  analyzed  in  a  diverse  array  of  applications  with  selfish  users  over  the 
past  few  years.  These  include  scheduling  (see  [29,  35]  and  the  references  therein),  facility 
location  [31,  45,  64,  97],  network  design  [3,  4,  18,  34,  36],  resource  allocation  [50,  52,  104], 
and  other  networking  games  [1,  5,  46,  51]. 

Third,  researchers  have  begun  to  study  the  inefficiency  of  different  notions  of  a  selfish 
outcome.  For  example,  Goemans,  Mirrokni,  and  Vetta  [45,  64]  have  extended  the  concept 
of  the  price  of  anarchy  to  games  in  which  equilibria  need  not  exist.  The  work  in  [45,  64]  is 
also  motivated  by  the  important  problem  of  understanding  when  a  small  price  of  anarchy 
implies  that  selfish  users  can  “learn”,  by  independent  and  repeated  experimentation  from 
an  arbitrary  initial  state,  an  approximately  optimal  outcome.  Another  example  is  given  by 
Christodoulou  and  Koutsoupias  [20],  who  studied  the  inefficiency  of  correlated  equilibria  in 
scheduling  games. 

Finally,  an  emerging  research  direction  is  to  use  the  price  of  anarchy  as  a  measure  for  the 
performance  of  a  network  protocol  that  interacts  with  selfish  users.  This  idea  connects  the 
analysis of the inefficiency of game-theoretic equilibria with mechanism design, a classical
subfield  of  microeconomics  that  studies  how  to  design  games  that  possess  equilibria  with 
good  properties  (see  e.g.  [63,  Chapter  23]  or  [68,  Chapter  10]).  For  example,  Johari  [50, 
Chapter  5]  considers  a  class  of  network  resource  allocation  protocols,  each  of  which  can  be 
viewed as a game with selfish users, and proves that a natural “proportional sharing” protocol
minimizes  the  (worst-case)  inefficiency  of  equilibria.  A  second  example  is  the  recent  work 
by  Chen,  Roughgarden,  and  Valiant  [19]  that  analyzes  how  the  price  of  stability  in  a  class 
of  network  design  games  [3]  depends  on  the  choice  of  an  underlying  cost-sharing  protocol. 